How does the quantization convert method work?

I have successfully called convert. However, I expected the converted model to keep my nn.Linear and nn.Conv2d layers, with their weights stored as int8. Instead, my layers are now QuantizedLinear and QuantizedConv2d.

See the Quantization page in the PyTorch 1.12 documentation, in particular:

Convert the observed model to a quantized model. This does several things: quantizes the weights, computes and stores the scale and bias value to be used with each activation tensor, and replaces key operators with quantized implementations.

model_int8 = torch.quantization.convert(model_fp32_prepared)
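To see where the int8 weights end up, here is a minimal sketch of eager-mode static quantization. The toy model `TinyNet`, its layer sizes, and the random calibration input are hypothetical and only there for illustration; the backend choice assumes either fbgemm or qnnpack is available in your build.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical toy model, used only to illustrate convert()."""

    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()      # fp32 -> quantized
        self.conv = nn.Conv2d(1, 2, kernel_size=1)
        self.fc = nn.Linear(2 * 4 * 4, 3)
        self.dequant = torch.quantization.DeQuantStub()  # quantized -> fp32

    def forward(self, x):
        x = self.quant(x)
        x = self.conv(x)
        x = torch.flatten(x, 1)
        x = self.fc(x)
        return self.dequant(x)


# Assumption: pick whichever quantized backend this build supports.
engines = torch.backends.quantized.supported_engines
backend = "fbgemm" if "fbgemm" in engines else "qnnpack"
torch.backends.quantized.engine = backend

model_fp32 = TinyNet().eval()
model_fp32.qconfig = torch.quantization.get_default_qconfig(backend)

# Insert observers, then run some data through them to calibrate.
model_fp32_prepared = torch.quantization.prepare(model_fp32)
model_fp32_prepared(torch.randn(8, 1, 4, 4))  # random stand-in for real data

model_int8 = torch.quantization.convert(model_fp32_prepared)
print(model_int8)  # conv and fc now show up as quantized modules

# On quantized modules, weight is a method that returns a quantized tensor.
w_fc = model_int8.fc.weight()
w_conv = model_int8.conv.weight()
print(w_fc.dtype)             # quantized dtype of the linear weight
print(w_fc.int_repr().dtype)  # dtype of the underlying integer storage
```

So the layers are replaced by quantized module classes rather than keeping `nn.Linear` and `nn.Conv2d`, and the int8 values live inside those modules. Calling `weight()` followed by `int_repr()` is one way to look at the raw integers.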