Incorrect results after loading saved quantized model

I have quantized and saved the model using this code -

model = Net()
model.load_state_dict(torch.load('model.pth', map_location=torch.device('cpu')))
model.qconfig = torch.quantization.default_qconfig
torch.quantization.prepare(model, inplace=True)
torch.quantization.convert(model, inplace=True)
x = evaluate(model)
torch.save(model.state_dict(), 'model_q.pth')

and loading the model like this -

model2 = Net()
model2.qconfig = torch.quantization.default_qconfig
torch.quantization.prepare(model2, inplace=True)
torch.quantization.convert(model2, inplace=True)
model2.load_state_dict(torch.load('model_q.pth'))
xQ = evaluate(model2)

Now x and xQ are different. I checked the parameters of both 'model' and 'model2', and they are the same.

import numpy as np

for p1, p2 in zip(model.parameters(), model2.parameters()):
    print(np.array_equal(p1.detach().numpy(), p2.detach().numpy()))

All parameters are equal.
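Note that quantization attributes such as 'scale' and 'zero_point' are stored as buffers, not parameters, so comparing parameters() alone can miss a mismatch. A minimal sketch of comparing full state_dicts instead (the helper name is mine, not from the original post):

```python
import torch

# Compare full state_dicts, not just parameters(): buffers such as
# 'scale', 'zero_point', and running stats are only in the state_dict.
def state_dicts_match(m1, m2):
    sd1, sd2 = m1.state_dict(), m2.state_dict()
    if sd1.keys() != sd2.keys():
        return False
    return all(torch.equal(sd1[k], sd2[k]) for k in sd1)
```

If this returns False while the parameter-by-parameter check passes, a buffer is the culprit.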

Is there anything incorrect with my method of saving or loading, or is this a bug in PyTorch?

I compared the outputs of all layers of the original model and the loaded model, and found that the output of BatchNorm2d differed.
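One way to do this layer-by-layer comparison is with forward hooks that record every module's output; run the same input through both models and diff the recorded tensors. This is a sketch of the general technique, not the exact code from the post:

```python
import torch

# Record each submodule's output during a forward pass via hooks.
def capture_outputs(model, x):
    outputs = {}
    hooks = []
    for name, module in model.named_modules():
        if name:  # skip the root module itself
            hooks.append(module.register_forward_hook(
                lambda m, inp, out, name=name: outputs.__setitem__(name, out.detach())))
    model(x)
    for h in hooks:
        h.remove()
    return outputs
```

Calling `capture_outputs` on both models with the same input and comparing the dicts key by key pinpoints the first layer whose outputs diverge.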

The problem is that PyTorch wasn't saving the 'scale' and 'zero_point' of an unfused QuantizedBatchNorm in checkpoints. Two solutions -

  • Save these values with pickle when saving the model; when loading, load the pickle and assign the scale and zero point back to the QuantizedBatchNorm layers.
  • Fuse BatchNorm with Convolution.
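A sketch of the second option, fusing BatchNorm into the preceding Convolution before quantization so no standalone QuantizedBatchNorm remains. The model definition here is a hypothetical stand-in for `Net`:

```python
import torch
import torch.nn as nn

# Hypothetical Conv-BN-ReLU model; names are illustrative, not from the post.
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3)
        self.bn = nn.BatchNorm2d(8)
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(self.bn(self.conv(x)))

model = Net()
model.eval()  # fusion for inference requires eval mode
# Fold BN (and ReLU) into the conv; 'bn' and 'relu' become Identity.
torch.quantization.fuse_modules(model, [['conv', 'bn', 'relu']], inplace=True)
```

After fusion, prepare/convert proceed as before, and the BN scale/zero_point problem disappears because the BN statistics are folded into the conv weights.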

hi @deepak_mangla, thanks for the report. I created an issue to verify correct behavior. Please let us know if you have a repro on a toy model.