I have quantized ResNet50; the per-channel quantized ResNet50 model gives accuracy on par with the floating-point model. If I save it with torch.jit.save, I can load it with torch.jit.load and run inference.
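For context, the jit save/load flow that works for me looks roughly like the sketch below. I am using a tiny stand-in module here instead of my actual ResNet50 (an assumption, just to keep the snippet self-contained), with eager-mode post-training static quantization:

```python
import io
import torch
import torch.ao.quantization as tq

class TinyModel(torch.nn.Module):
    """Stand-in for the real ResNet50, just to show the save/load flow."""
    def __init__(self):
        super().__init__()
        self.quant = tq.QuantStub()
        self.fc = torch.nn.Linear(4, 4)
        self.dequant = tq.DeQuantStub()

    def forward(self, x):
        return self.dequant(self.fc(self.quant(x)))

model = TinyModel().eval()
model.qconfig = tq.get_default_qconfig("fbgemm")  # x86 backend
tq.prepare(model, inplace=True)
model(torch.randn(8, 4))          # calibration pass
tq.convert(model, inplace=True)   # now a quantized model

# torch.jit.save / torch.jit.load round trip (via an in-memory buffer)
scripted = torch.jit.script(model)
buf = io.BytesIO()
torch.jit.save(scripted, buf)
buf.seek(0)
loaded = torch.jit.load(buf)
out = loaded(torch.randn(2, 4))   # inference works on the loaded model
```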
How can I use torch.save and torch.load with a quantized model?
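To make the question concrete: my understanding (please correct me if this is wrong) is that you save the state_dict, rebuild the same quantized architecture, and then load the state dict into it. A minimal sketch of that, again with a tiny stand-in module rather than my actual ResNet50:

```python
import io
import torch
import torch.ao.quantization as tq

class TinyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = tq.QuantStub()
        self.fc = torch.nn.Linear(4, 4)
        self.dequant = tq.DeQuantStub()

    def forward(self, x):
        return self.dequant(self.fc(self.quant(x)))

def make_quantized():
    """Build, calibrate, and convert a fresh quantized model."""
    m = TinyModel().eval()
    m.qconfig = tq.get_default_qconfig("fbgemm")
    tq.prepare(m, inplace=True)
    m(torch.randn(8, 4))         # calibration pass
    tq.convert(m, inplace=True)
    return m

q1 = make_quantized()
buf = io.BytesIO()
torch.save(q1.state_dict(), buf)  # save only the state dict

# To load: rebuild the same quantized architecture first (prepare + convert),
# then load the saved state dict over it.
q2 = make_quantized()
buf.seek(0)
# weights_only=False because the state dict contains quantized tensors
q2.load_state_dict(torch.load(buf, weights_only=False))
```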
Will the entire state dict share a single scale and zero point, or does each layer have its own?
How can I get each layer's scale and zero point from the quantized model?
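Concretely, I am after something like the loop below, which walks the modules and reads out the weight quantization parameters (q_per_channel_scales / q_scale are the accessors I found in the tensor docs; shown on a tiny stand-in model, not my actual ResNet50):

```python
import torch
import torch.ao.quantization as tq

class TinyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = tq.QuantStub()
        self.fc = torch.nn.Linear(4, 4)
        self.dequant = tq.DeQuantStub()

    def forward(self, x):
        return self.dequant(self.fc(self.quant(x)))

model = TinyModel().eval()
model.qconfig = tq.get_default_qconfig("fbgemm")  # per-channel weights by default
tq.prepare(model, inplace=True)
model(torch.randn(8, 4))        # calibration pass
tq.convert(model, inplace=True)

weight_qparams = {}
for name, mod in model.named_modules():
    # Quantized modules expose their weight through a weight() method.
    weight_fn = getattr(mod, "weight", None)
    w = weight_fn() if callable(weight_fn) else None
    if w is not None and w.is_quantized:
        if w.qscheme() in (torch.per_channel_affine, torch.per_channel_symmetric):
            weight_qparams[name] = (w.q_per_channel_scales(),
                                    w.q_per_channel_zero_points())
        else:  # per-tensor quantized weight
            weight_qparams[name] = (w.q_scale(), w.q_zero_point())

print(weight_qparams)
# The output (activation) scale/zero_point live on the quantized module itself:
print(model.fc.scale, model.fc.zero_point)
```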