Static Quantized model

Hello

I want to make a quantized model, so I followed the FX Graph Mode Post Training Static Quantization tutorial with my ResNet50 model.
Afterwards I checked that the model's weight size had been reduced.
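Here is roughly what I did, as a minimal sketch of the tutorial's flow (the backend string, input shape, and calibration loop below are placeholders, not my exact code):

```python
import torch
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx
from torchvision.models import resnet50

# Start from a trained fp32 model in eval mode (required for PTQ).
float_model = resnet50(weights="IMAGENET1K_V1").eval()

qconfig_mapping = get_default_qconfig_mapping("fbgemm")  # x86 server backend
example_inputs = (torch.randn(1, 3, 224, 224),)

# Insert observers into the traced graph.
prepared = prepare_fx(float_model, qconfig_mapping, example_inputs)

# Calibrate with representative fp32 batches (random data as a placeholder).
with torch.no_grad():
    for _ in range(8):
        prepared(torch.randn(1, 3, 224, 224))

# Convert the calibrated graph into the actual int8 quantized model.
quantized_model = convert_fx(prepared)
```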

here is my question.

  1. After quantizing the trained model, do I need to convert my inputs to int8 before feeding them in at inference time?
  2. Are my quantized model's weights of type int8?

Short answer:

  1. No
  2. Yes

After quantization, the weights of the quantized model have already been converted to int8.

You can check it via quantized_model.layer_name.weight().dtype.
More details can be found here: Debugging Quantized Model.
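For example, something like this (a sketch; conv1 is just an illustrative layer name from ResNet50, and after fusion it may show up as a quantized ConvReLU module):

```python
# Quantized conv/linear modules expose their packed weight via weight().
w = quantized_model.conv1.weight()
print(w.dtype)  # expected: torch.qint8 for statically quantized weights
```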

Thanks a lot.
Really helpful!!

So, for Q1, don’t I have to change my input data type from fp32 to int8?

You don’t have to; PyTorch will do it automatically in FX mode.
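For instance (a sketch, assuming the quantized_model produced by convert_fx above; the input shape is just an example):

```python
import torch

# Feed an ordinary fp32 tensor: the quantize op inserted during
# conversion turns it into int8 internally, and the final dequantize
# op returns an fp32 output.
x = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    out = quantized_model(x)
print(out.dtype)  # torch.float32
```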

Really helpful for my work.
Thanks!!!