Static Quantized model

Hello

I want to make a quantized model, so I followed the FX Graph Mode Post Training Static Quantization tutorial with my ResNet50 model.
Afterwards I checked that the model's weight size had been reduced.
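Here is roughly what I did, as a minimal sketch of the tutorial's flow (the backend string, input shape, and calibration loop below are placeholders, not my exact code):

```python
import torch
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx
from torchvision.models import resnet50

# Start from a trained fp32 model in eval mode (required for PTQ).
float_model = resnet50(weights="IMAGENET1K_V1").eval()

qconfig_mapping = get_default_qconfig_mapping("fbgemm")  # x86 server backend
example_inputs = (torch.randn(1, 3, 224, 224),)

# Insert observers into the traced graph.
prepared = prepare_fx(float_model, qconfig_mapping, example_inputs)

# Calibrate with representative fp32 batches (random data as a placeholder).
with torch.no_grad():
    for _ in range(8):
        prepared(torch.randn(1, 3, 224, 224))

# Convert the calibrated graph into the actual int8 quantized model.
quantized_model = convert_fx(prepared)
```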

here is my question.

  1. After quantizing the trained model, do I need to convert my inputs to int8 before feeding them in at inference time?
  2. Are my quantized model's weights of type int8?

Short answer:

  1. No
  2. Yes

After quantization, the weights of the quantized model have already been converted to int8.

You can check it via quantized_model.layer_name.weight().dtype.
More details can be found here: Debugging Quantized Model.
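For example, something like this (a sketch; conv1 is just an illustrative layer name from ResNet50, and after fusion it may show up as a quantized ConvReLU module):

```python
# Quantized conv/linear modules expose their packed weight via weight().
w = quantized_model.conv1.weight()
print(w.dtype)  # expected: torch.qint8 for statically quantized weights
```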

Thanks a lot.
Really helpful!!

So, for Q1, don’t I have to change my input data type from fp32 to int8?

You don’t have to; PyTorch will do it automatically in FX mode.
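For instance (a sketch, assuming the quantized_model produced by convert_fx above; the input shape is just an example):

```python
import torch

# Feed an ordinary fp32 tensor: the quantize op inserted during
# conversion turns it into int8 internally, and the final dequantize
# op returns an fp32 output.
x = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    out = quantized_model(x)
print(out.dtype)  # torch.float32
```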

Really helpful for my work.
Thanks!!!