How to save the quantized model?

I used linear quantization, but the quantized model’s size is unchanged. It seems that ‘torch.save()’ still saves the weights in float format…
How can I save the quantized weights? I would really appreciate your help.


Have you solved the problem? Or do you have any ideas on how to do quantization with PyTorch?

No… I quantized the model to 2 bits, but it is still saved in 32-bit.
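A likely explanation, offered as a guess without having seen the code: many quantization scripts only snap the weights to a few levels and keep them in float32 tensors, so each weight still takes 4 bytes on disk. The sketch below uses only the Python standard library to show the difference between that kind of simulated quantization and actually packing 2-bit codes. The helper names (quantize_2bit, pack_2bit, unpack_2bit) are made up for illustration and are not PyTorch APIs.

```python
import array
import random

def quantize_2bit(weights):
    """Map each float to one of 4 levels (codes 0..3) over the observed range."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 3 or 1.0  # fall back to 1.0 if all weights are equal
    codes = [min(3, max(0, round((w - lo) / scale))) for w in weights]
    return codes, scale, lo

def pack_2bit(codes):
    """Pack four 2-bit codes into each byte."""
    packed = bytearray((len(codes) + 3) // 4)
    for i, c in enumerate(codes):
        packed[i // 4] |= c << (2 * (i % 4))
    return bytes(packed)

def unpack_2bit(packed, n):
    """Recover the first n 2-bit codes from the packed bytes."""
    return [(packed[i // 4] >> (2 * (i % 4))) & 0b11 for i in range(n)]

random.seed(0)
weights = [random.uniform(-1.0, 1.0) for _ in range(1024)]
codes, scale, zero = quantize_2bit(weights)

# Simulated quantization: only 4 distinct values, but still stored as float32.
fake_quant = array.array("f", [zero + c * scale for c in codes])

# Real packing: store the 2-bit codes, plus scale and zero point separately.
packed = pack_2bit(codes)

print(len(fake_quant.tobytes()))  # 4096 bytes (1024 weights * 4 bytes)
print(len(packed))                # 256 bytes (1024 weights * 2 bits)
```

To shrink the saved file you would have to store something like packed together with scale and zero, and dequantize after loading. Saving the float tensor that holds the quantized values will not reduce the size.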

Can you provide the GitHub link to the code so that we can help?

I have attempted this and am facing the same issues. I used the approach from the following repo:

When I try to save the model with torch.save the file size does not show any decrease.

Hi Richard - were you able to quantize your PyTorch models successfully?

With the quantization support added in PyTorch 1.3, this should work if you follow the flow in the PyTorch quantization tutorials: https://pytorch.org/tutorials/advanced/static_quantization_tutorial.html
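For a quick check that the built-in quantization really changes what torch.save writes, here is a minimal sketch. It uses dynamic quantization (torch.quantization.quantize_dynamic) rather than the static flow from the tutorial, because it needs no calibration data. The toy model and the serialized_size helper are assumptions made up for this example, and it assumes a PyTorch build with a quantized backend available.

```python
import io

import torch
import torch.nn as nn

def serialized_size(model):
    """Return the number of bytes torch.save writes for the model's state_dict."""
    buf = io.BytesIO()
    torch.save(model.state_dict(), buf)
    return buf.getbuffer().nbytes

# Toy float32 model, used only for illustration.
float_model = nn.Sequential(
    nn.Linear(256, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

# Replace the nn.Linear modules with dynamically quantized versions
# whose weights are stored as int8.
quantized_model = torch.quantization.quantize_dynamic(
    float_model, {nn.Linear}, dtype=torch.qint8
)

float_bytes = serialized_size(float_model)
quant_bytes = serialized_size(quantized_model)
print(float_bytes, quant_bytes)
```

The second number should come out noticeably smaller than the first, since the Linear weights are serialized as int8 instead of float32. Note that this gives 8-bit storage; as far as I know the built-in flow does not pack weights down to 2 bits.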