I have successfully called torch.ao.quantization.convert. However, I expected the converted model to replace my nn.Linear and nn.Conv2d layers with versions whose weights are stored as int8. Instead, my layers are now QuantizedLinear and QuantizedConv2d modules.
See the Quantization — PyTorch 2.1 documentation; in particular:
"Convert the observed model to a quantized model. This does several things: quantizes the weights, computes and stores the scale and bias value to be used with each activation tensor, and replaces key operators with quantized implementations."
model_int8 = torch.quantization.convert(model_fp32_prepared)
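A minimal sketch of the eager-mode static quantization flow, assuming an x86 machine (the "fbgemm" backend) and a toy model with QuantStub/DeQuantStub; the point is that after convert() the module class changes to a quantized implementation, and the int8 weight lives inside it, retrievable via the module's weight() method as a quantized tensor:

```python
import torch
import torch.nn as nn

# Toy model wrapped with quant/dequant stubs so eager-mode static
# quantization knows where the quantized region begins and ends.
class M(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.ao.quantization.QuantStub()
        self.conv = nn.Conv2d(1, 1, 1)
        self.dequant = torch.ao.quantization.DeQuantStub()

    def forward(self, x):
        return self.dequant(self.conv(self.quant(x)))

model_fp32 = M().eval()
model_fp32.qconfig = torch.ao.quantization.get_default_qconfig("fbgemm")

# Insert observers, run a calibration pass, then convert.
model_prepared = torch.ao.quantization.prepare(model_fp32)
model_prepared(torch.randn(4, 1, 8, 8))  # calibration with sample data
model_int8 = torch.ao.quantization.convert(model_prepared)

# The module is now a quantized Conv2d; its packed weight is qint8.
print(model_int8.conv)                  # prints a QuantizedConv2d module
print(model_int8.conv.weight().dtype)   # torch.qint8
```

So the QuantizedLinear / QuantizedConv2d classes you see are expected: they are the quantized implementations that convert() swaps in, and they hold the int8 weights internally (along with the scale and zero-point) rather than exposing an int8 .weight attribute the way the float modules do.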