Quantized model is bigger than the non-quantized model

Rami_Ismael · September 16, 2022, 2:09pm

I went through the PyTorch documentation for quantization-aware training. I prepare the model and then convert it to a quint8 model. I save the model using state_dict(). However, when I check the file size using os.path.getsize(), the quantized model is bigger than the non-quantized model.

Zafar · September 19, 2022, 5:41pm

Can you try running a bigger model? Currently you have a weight with a single element. Even if you store the weight in int8, you still need 32 bits for the scale and 32 bits for the zero point. Try making a bigger model with multiple conv layers of more realistic sizes.
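
To make that arithmetic concrete, here is a rough sketch in plain Python (no PyTorch needed). The helper names fp32_bytes and int8_bytes are hypothetical, and the sketch assumes per-tensor quantization with one 32-bit scale and one 32-bit zero point per weight tensor, as described above. It counts only raw parameter bytes and ignores any fixed overhead the saved file format adds, so real file sizes will differ.

```python
def fp32_bytes(num_weights: int) -> int:
    """Raw storage for float32 weights: 4 bytes per element."""
    return 4 * num_weights


def int8_bytes(num_weights: int, scale_bytes: int = 4, zero_point_bytes: int = 4) -> int:
    """Raw storage for int8 weights (1 byte per element) plus one
    per-tensor scale and one per-tensor zero point (assumed 4 bytes each)."""
    return num_weights + scale_bytes + zero_point_bytes


# Compare a single-element weight with a more realistic conv weight
# (64 output channels, 3 input channels, 3x3 kernel = 1728 elements).
for n in (1, 3, 64 * 3 * 3 * 3):
    print(f"{n:>5} weights: fp32 = {fp32_bytes(n)} B, int8 = {int8_bytes(n)} B")
```

Under these assumptions, a one-element weight takes 4 bytes in float32 but 9 bytes quantized, while the 1728-element conv weight drops from 6912 bytes to 1736. The quantization parameters are a fixed cost per tensor, so they only dominate when the tensor itself is tiny.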