Steps to Create Quantized Model

Can anyone give the details and steps needed to do quantization?
There is some confusion: the MobileNetV2 example does it one way, and other examples do it differently.
Is there a general way?

Hi Mohit,
Are you interested in quantizing CV models? Are you targeting mobile devices?
If so, you can follow the tutorial here: https://pytorch.org/tutorials/advanced/static_quantization_tutorial.html

In general, you can start with post-training quantization (the available options depend on whether you are targeting server or mobile). You can get the recommended qconfig for post-training quantization by calling:

# Gets the recommended qconfig for post training quantization
model.qconfig = torch.quantization.get_default_qconfig('fbgemm') #'fbgemm' for server and 'qnnpack' for mobile
#Also, remember to set your backend engine to match what you use here:
torch.backends.quantization.engine = 'fbgemm' 
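Putting the pieces together, eager-mode post-training static quantization roughly looks like this. TinyNet is a made-up toy model for illustration only; the QuantStub/DeQuantStub pair marks where tensors cross the float/int8 boundary:

```python
import torch
import torch.nn as nn

# Toy model, invented for this sketch (not from the tutorial).
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()      # float -> int8 entry point
        self.fc = nn.Linear(4, 2)
        self.dequant = torch.quantization.DeQuantStub()  # int8 -> float exit point

    def forward(self, x):
        return self.dequant(self.fc(self.quant(x)))

model = TinyNet().eval()
# 'fbgemm' for server; you would use 'qnnpack' for mobile, and set
# torch.backends.quantized.engine to match.
model.qconfig = torch.quantization.get_default_qconfig('fbgemm')

# prepare() inserts observers; run representative data through the model
# so they can record activation ranges (the calibration step).
prepared = torch.quantization.prepare(model)
for _ in range(8):
    prepared(torch.randn(1, 4))

# convert() swaps float modules for quantized ones using the observed ranges.
quantized = torch.quantization.convert(prepared)
```

After convert, `quantized.fc` is a quantized Linear module and the model runs int8 arithmetic internally while still accepting and returning float tensors.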

If the accuracy is not good enough, you will have to do quantization-aware training, which is more involved. You can see reference code at: https://github.com/pytorch/vision/blob/master/references/classification/train_quantization.py
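For reference, the quantization-aware training loop follows the same prepare/convert pattern, except the model is trained with fake-quant modules in place before converting. A minimal sketch, again using a made-up toy model and a dummy loss:

```python
import torch
import torch.nn as nn

# Toy model invented for illustration; same stub pattern as post-training quantization.
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()
        self.fc = nn.Linear(4, 2)
        self.dequant = torch.quantization.DeQuantStub()

    def forward(self, x):
        return self.dequant(self.fc(self.quant(x)))

model = TinyNet().train()
model.qconfig = torch.quantization.get_default_qat_qconfig('fbgemm')

# prepare_qat() inserts fake-quant modules that simulate int8 rounding
# in the forward pass, so the weights adapt to quantization error.
prepared = torch.quantization.prepare_qat(model)

opt = torch.optim.SGD(prepared.parameters(), lr=0.01)
for _ in range(10):                       # stand-in for a real training loop
    loss = prepared(torch.randn(8, 4)).pow(2).mean()  # dummy loss
    opt.zero_grad()
    loss.backward()
    opt.step()

# Switch to eval and convert to a real int8 model.
quantized = torch.quantization.convert(prepared.eval())
```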


Hey @raghuramank100, instead of giving wrong answers, please don't answer at all. It is misleading to others. The API you mentioned doesn't even exist. I wasted two hours on this.
Thanks,
Mohit Ranawat

Hi Mohit,
Looks like I made a mistake in the instructions. This should work:

qconfig = torch.quantization.get_default_qconfig('fbgemm')
print(torch.backends.quantized.supported_engines) # Prints the quantized backends that are supported
# Set the backend to what is needed. This needs to be consistent with the option you used to select the qconfig
torch.backends.quantized.engine = 'fbgemm'
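To connect this back to the original question about a "general way": whichever backend you pick, the underlying scheme is the same affine mapping between float values and int8 integers. A pure-Python sketch of that mapping (values and function names are illustrative, not PyTorch APIs):

```python
# Affine quantization: q = round(x / scale) + zero_point, clamped to the int8 range.
def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))

# Dequantization recovers an approximation of the original float value.
def dequantize(q, scale, zero_point):
    return (q - zero_point) * scale

# Example: with scale=0.1 and zero_point=0, the float 1.23 maps to int 12,
# which dequantizes back to 1.2 -- the rounding error is the quantization loss.
q = quantize(1.23, scale=0.1, zero_point=0)
x = dequantize(q, scale=0.1, zero_point=0)
```

The per-tensor `scale` and `zero_point` are exactly what the observers inserted by `prepare()` estimate during calibration.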