I have quantized ResNet50; the per-channel quantized model gives good accuracy, on par with the floating-point model. If I save it with torch.jit.save, I can load it with torch.jit.load and run inference.
How can I use torch.save and torch.load with a quantized model?
Will the entire state dict have the same scale and zero_point?
How can I get each layer's scale and zero_point from the quantized model?
How can I use torch.save and torch.load with a quantized model?
Currently only torch.save(model.state_dict()) and model.load_state_dict(…) are supported, I believe; saving and loading the whole model object directly with torch.save/torch.load is not yet supported.
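For example, the supported pattern looks roughly like this (a minimal sketch: dynamic quantization of a tiny Linear model stands in for your statically quantized ResNet50, since the save/load pattern is the same; the file name is made up):

```python
import torch

# Minimal sketch: a dynamically quantized tiny model stands in for the
# statically quantized ResNet50 -- the save/load pattern is the same.
float_model = torch.nn.Sequential(torch.nn.Linear(4, 4))
qmodel = torch.quantization.quantize_dynamic(
    float_model, {torch.nn.Linear}, dtype=torch.qint8
)

# Supported: save only the state_dict, not the whole module.
torch.save(qmodel.state_dict(), "qmodel_sketch.pth")

# To load, rebuild the same quantized architecture first, then load weights.
qmodel2 = torch.quantization.quantize_dynamic(
    torch.nn.Sequential(torch.nn.Linear(4, 4)), {torch.nn.Linear}, dtype=torch.qint8
)
# weights_only=False because the state_dict contains quantized/packed tensors.
qmodel2.load_state_dict(torch.load("qmodel_sketch.pth", weights_only=False))
```

After loading, the rebuilt model should produce the same outputs as the original quantized model.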
Will the entire state dict have the same scale and zero_point?
No, each quantized layer has its own scale and zero_point, calculated during the calibration step.
How can I get each layer's scale and zero_point from the quantized model?
You can print the quantized model, and its repr will show each layer's scale and zero_point.
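For instance, with a hypothetical minimal model (a single conv standing in for ResNet50), printing after convert shows the quantization parameters:

```python
import torch

# Pick whichever quantized engine this machine supports (assumption:
# fbgemm or qnnpack is available).
backend = "fbgemm" if "fbgemm" in torch.backends.quantized.supported_engines else "qnnpack"
torch.backends.quantized.engine = backend

# Hypothetical minimal model: a single conv stands in for ResNet50.
class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()
        self.conv = torch.nn.Conv2d(3, 8, 3)
        self.dequant = torch.quantization.DeQuantStub()

    def forward(self, x):
        return self.dequant(self.conv(self.quant(x)))

m = M().eval()
m.qconfig = torch.quantization.get_default_qconfig(backend)
torch.quantization.prepare(m, inplace=True)
m(torch.randn(1, 3, 16, 16))              # calibration pass
torch.quantization.convert(m, inplace=True)

# The repr of each quantized module includes scale=... and zero_point=...
print(m)
```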
Also, check whether it is just the __repr__ that is not showing the info or whether the quantization parameters are really missing: try accessing the scale and zero_point directly.
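Direct access looks like this (again a minimal single-conv sketch, not your actual ResNet50; activation qparams live on the quantized module, weight qparams on the quantized weight tensor):

```python
import torch

# Pick whichever quantized engine this machine supports.
backend = "fbgemm" if "fbgemm" in torch.backends.quantized.supported_engines else "qnnpack"
torch.backends.quantized.engine = backend

# Hypothetical minimal model: a single conv stands in for ResNet50.
class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()
        self.conv = torch.nn.Conv2d(3, 8, 3)
        self.dequant = torch.quantization.DeQuantStub()

    def forward(self, x):
        return self.dequant(self.conv(self.quant(x)))

m = M().eval()
m.qconfig = torch.quantization.get_default_qconfig(backend)
torch.quantization.prepare(m, inplace=True)
m(torch.randn(1, 3, 16, 16))              # calibration pass
torch.quantization.convert(m, inplace=True)

# Activation scale/zero_point are attributes of the quantized module.
print(m.conv.scale, m.conv.zero_point)

# Weight scale/zero_point are stored on the quantized weight tensor.
w = m.conv.weight()
if w.qscheme() in (torch.per_channel_affine, torch.per_channel_symmetric):
    print(w.q_per_channel_scales())       # one scale per output channel
    print(w.q_per_channel_zero_points())
else:                                     # per-tensor weight quantization
    print(w.q_scale(), w.q_zero_point())
```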
Before loading the state_dict, be sure to run the whole post-training preparation process on a freshly instantiated model: layer fusion, torch.quantization.prepare(), and torch.quantization.convert().
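A sketch of the full round trip, with a toy conv+relu model standing in for ResNet50 (the file name is made up):

```python
import torch

def build_quantized(backend):
    # Toy conv+relu model standing in for ResNet50.
    class M(torch.nn.Module):
        def __init__(self):
            super().__init__()
            self.quant = torch.quantization.QuantStub()
            self.conv = torch.nn.Conv2d(3, 8, 3)
            self.relu = torch.nn.ReLU()
            self.dequant = torch.quantization.DeQuantStub()

        def forward(self, x):
            return self.dequant(self.relu(self.conv(self.quant(x))))

    m = M().eval()
    m.qconfig = torch.quantization.get_default_qconfig(backend)
    torch.quantization.fuse_modules(m, [["conv", "relu"]], inplace=True)  # 1. fuse
    torch.quantization.prepare(m, inplace=True)                           # 2. prepare
    m(torch.randn(1, 3, 16, 16))                                          # calibrate
    torch.quantization.convert(m, inplace=True)                           # 3. convert
    return m

backend = "fbgemm" if "fbgemm" in torch.backends.quantized.supported_engines else "qnnpack"
torch.backends.quantized.engine = backend

model = build_quantized(backend)
torch.save(model.state_dict(), "q_resnet_sketch.pth")

# Later: repeat the same fuse/prepare/convert steps, THEN load the state_dict.
# Loading overwrites the fresh model's calibrated scales/zero_points and weights.
model2 = build_quantized(backend)
model2.load_state_dict(torch.load("q_resnet_sketch.pth", weights_only=False))
```

Because the state_dict restores the weights and all scale/zero_point buffers, the reloaded model should match the original exactly.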