Quantization after exporting to ONNX

I converted a PyTorch model to ONNX and then tried quantizing the ONNX model, expecting faster inference. Unfortunately, it didn't get any faster. Any hints on how to proceed with quantization for an ONNX model would be really helpful.

Thanks in advance.

Hi Harish, did you mean ONNX models? Unfortunately, exporting quantized models to ONNX is not an area the PyTorch team is actively maintaining.
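
If you want to quantize the already-exported ONNX model directly, one option is ONNX Runtime's own quantization tooling rather than PyTorch's. Below is a minimal sketch using `onnxruntime.quantization.quantize_dynamic` plus a rough latency check; the file names (`model.onnx`, `model_int8.onnx`) and the input shape are placeholders you'd replace with your own:

```python
import time
import numpy as np
import onnxruntime as ort
from onnxruntime.quantization import quantize_dynamic, QuantType

# Dynamic quantization: weights are converted to int8 offline,
# activations are quantized on the fly at inference time.
quantize_dynamic(
    model_input="model.onnx",        # exported FP32 model (placeholder path)
    model_output="model_int8.onnx",  # where to write the quantized model
    weight_type=QuantType.QInt8,
)

# Rough latency comparison on CPU; adjust the dummy input to your model's shape.
def avg_latency(path, runs=100):
    sess = ort.InferenceSession(path, providers=["CPUExecutionProvider"])
    name = sess.get_inputs()[0].name
    dummy = np.random.rand(1, 3, 224, 224).astype(np.float32)  # placeholder shape
    start = time.perf_counter()
    for _ in range(runs):
        sess.run(None, {name: dummy})
    return (time.perf_counter() - start) / runs

print("fp32:", avg_latency("model.onnx"))
print("int8:", avg_latency("model_int8.onnx"))
```

Note that dynamic quantization mainly speeds up MatMul/Gemm-heavy models on CPU; for conv-heavy models or GPU execution providers you may see little or no gain, and static quantization with a calibration dataset may be worth trying instead.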