Questions about build customized quantizer

cokespace2 · March 8, 2023, 2:32am

I’m trying to build a customized quantizer, which takes a FP32 model as input and then output a quantized model, basically I just need to quantize all convolution layers, and I got few questions.

Should I replace all nn.Conv2d modules to nn.quantized.Conv2d modules?
If I set a nn.Conv2d module’s weight to dtype torch.int8 (not torch.qint8) and input a tensor of dtype torch.int8 and remove bias, does it run like a nn.quantized.Conv2d?

jerryzh168 · October 23, 2024, 4:59am

are you talking about pt2e quantization? we have a guide here: How to Write a Quantizer for PyTorch 2 Export Quantization — PyTorch Tutorials 2.5.0+cu124 documentation

you don’t need to replace nn.Conv2d modules in pt2e
nn.quantized.Conv2d also has output quantization I think so just replacing input is not enough