How do I save and load a quantized model?

I have quantized ResNet50. The per-channel quantized ResNet50 model gives good accuracy, the same as the floating-point model. If I save it with torch.jit.save, I can load it with torch.jit.load and run inference.

How can I use torch.save and torch.load on a quantized model?
Will the entire state dict have the same scale and zero point?
How can I get each layer's scale and zero point from the quantized model?

How can I use torch.save and torch.load on a quantized model?

Currently I think we only support torch.save(model.state_dict()) and model.load_state_dict(…); saving and loading the model directly with torch.save/torch.load is not yet supported, I believe.
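
For example, a minimal sketch of the supported path (assuming `quantized_model` is your already-converted model; the file name is arbitrary):

```python
import torch

# Save only the state_dict of the converted (quantized) model.
torch.save(quantized_model.state_dict(), "quantized_state_dict.pth")

# Later: rebuild the same quantized module hierarchy first
# (fuse/prepare/convert), then load the saved weights and qparams into it.
quantized_model.load_state_dict(torch.load("quantized_state_dict.pth"))
```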

Will the entire state dict have the same scale and zero point?

No, each layer will have its own scale and zero_point, calculated during the calibration step.
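
As a sketch of where those values come from (assuming a `float_model` and a representative `calibration_loader`; both names are placeholders):

```python
import torch

float_model.eval()
float_model.qconfig = torch.quantization.get_default_qconfig("fbgemm")

# prepare() inserts observers that record activation statistics.
prepared = torch.quantization.prepare(float_model)

# Running representative data through the model is the calibration step;
# the observers use it to derive each layer's scale and zero_point.
with torch.no_grad():
    for images, _ in calibration_loader:
        prepared(images)

# convert() swaps float modules for quantized ones, baking in the qparams.
quantized = torch.quantization.convert(prepared)
```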

How can I get each layer's scale and zero point from the quantized model?

You can print the quantized model and it will show the scale and zero_point of each layer, e.g.:

```python
>>> print(torch.nn.quantized.Conv2d(3, 3, 3))
QuantizedConv2d(3, 3, kernel_size=(3, 3), stride=(1, 1), scale=1.0, zero_point=0)
```
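
To collect these programmatically rather than reading the printed repr, a sketch (assuming `qmodel` is a converted quantized model):

```python
# Walk the module tree and report the output scale/zero_point of every
# quantized module that exposes them as attributes.
for name, module in qmodel.named_modules():
    if hasattr(module, "scale") and hasattr(module, "zero_point"):
        print(name, float(module.scale), int(module.zero_point))
```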

Thank you @jerryzh168

I was able to save with model.state_dict(), but I am not able to load the model with model.load_state_dict(); it raises a KeyError.

Secondly, if I save with torch.jit.save(torch.jit.script(pcqmodel), "quantization_per_channel_model.pth"), I am not able to see the quantization info after loading the model. I referred to this issue.

Are you using the most recent version? Could you try again with the PyTorch nightly builds?

Also, check whether it is just the __repr__ that is not showing the info or whether the quant params are really missing: try getting the scale and zero_point directly.
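
For instance, a sketch of reading the qparams directly from a module (the accessor depends on the weight's qscheme; the Conv2d here is just a stand-in for a layer from your model):

```python
import torch

conv = torch.nn.quantized.Conv2d(3, 3, 3)  # stand-in for a layer of your model
print(conv.scale, conv.zero_point)         # output activation qparams

w = conv.weight()                          # the quantized weight tensor
if w.qscheme() in (torch.per_channel_affine, torch.per_channel_symmetric):
    print(w.q_per_channel_scales(), w.q_per_channel_zero_points())
else:
    print(w.q_scale(), w.q_zero_point())
```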

Be sure you run the whole post-training preparation process (layer fusion, torch.quantization.prepare(), and torch.quantization.convert()) on a freshly constructed float model before loading the state_dict, so that the module hierarchy, and hence the state_dict keys, matches the saved quantized model.
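
Put together, a sketch of the load side (MyModel and the fusion list are placeholders for your own architecture; the fusion must match what was done before saving):

```python
import torch

model = MyModel()  # same float architecture as at save time (placeholder name)
model.eval()
model.qconfig = torch.quantization.get_default_qconfig("fbgemm")

# Repeat the exact transformations used before saving so the keys line up.
# torch.quantization.fuse_modules(model, [["conv1", "bn1", "relu1"]], inplace=True)
torch.quantization.prepare(model, inplace=True)
torch.quantization.convert(model, inplace=True)

# No calibration pass is needed here: the saved state_dict already contains
# the scales and zero_points and will overwrite the defaults.
model.load_state_dict(torch.load("quantized_state_dict.pth"))
```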
