Loading of Quantized Model

I have a quantized model and I want to load it in PyTorch, but I am not able to do it.
After quantization the model definition changes, because the BatchNormalization layers get fused.
But when I load the model I only have the previous definition, which does not contain the fused layers, although the other layers (like the quant and dequant stubs) are there.

Is there a way to load a quantized model in PyTorch?

Hi Mohit,
Can you provide more details/code? You can save and load quantized models via the state_dict(). When you perform fusion, make sure you set inplace=True.
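
For reference, here is a minimal sketch of what the saving side looks like in the eager-mode flow. The Net class, layer names, and the single calibration pass are all hypothetical, just to make the example self-contained:

```python
import torch
import torch.nn as nn
import torch.quantization as tq

class Net(nn.Module):  # hypothetical float model definition
    def __init__(self):
        super().__init__()
        self.quant = tq.QuantStub()
        self.conv = nn.Conv2d(3, 16, 3)
        self.bn = nn.BatchNorm2d(16)
        self.relu = nn.ReLU()
        self.dequant = tq.DeQuantStub()

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.bn(self.conv(x)))
        return self.dequant(x)

model = Net().eval()  # fusion with BatchNorm requires eval mode

# Fuse conv + bn + relu in place, so the module itself is rewritten
tq.fuse_modules(model, [["conv", "bn", "relu"]], inplace=True)

model.qconfig = tq.get_default_qconfig("fbgemm")
tq.prepare(model, inplace=True)

# ... run your calibration data through the model here ...
model(torch.randn(1, 3, 32, 32))  # placeholder calibration pass

tq.convert(model, inplace=True)

# Save only the state_dict; the quantized module structure is rebuilt at load time
torch.save(model.state_dict(), "quantized_model.pth")
```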

Hey @raghuramank100, I have saved the model correctly, but to use it in PyTorch we need the model definition before we can load the state_dict from the saved file.
What I have is the definition of the model without the fused layers; that is where the definition diverges, and why I can't load the model.

So do I need to change the model definition according to the fused layers?

I think the expectation is to take the original model, go through the whole eager-mode quantization flow again, and then load from the saved state_dict.


Hi mohit7,
Make sure you create the net using the previous definition, and let it go through the same process that was applied during quantization (fuse_model, prepare, and convert), without rerunning the calibration step.
After that you can load the quantized state_dict in. Hope it helps.
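
In other words, a minimal loading sketch, reusing the hypothetical Net class from the saving example above: rebuild the float model, replay fusion/prepare/convert so the module structure matches the saved one, and only then load the weights.

```python
import torch
import torch.quantization as tq

model = Net().eval()  # the original (pre-fusion) float definition

# Replay the same transformations that produced the saved model,
# but skip the calibration forward passes
tq.fuse_modules(model, [["conv", "bn", "relu"]], inplace=True)
model.qconfig = tq.get_default_qconfig("fbgemm")
tq.prepare(model, inplace=True)
tq.convert(model, inplace=True)

# The module structure now matches the saved quantized state_dict
model.load_state_dict(torch.load("quantized_model.pth"))
```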
