Questions about building a customized quantizer

I’m trying to build a customized quantizer that takes an FP32 model as input and outputs a quantized model. Basically, I just need to quantize all convolution layers, and I have a few questions.

  1. Should I replace all nn.Conv2d modules with nn.quantized.Conv2d modules?
  2. If I set an nn.Conv2d module’s weight to dtype torch.int8 (not torch.qint8), feed it an input tensor of dtype torch.int8, and remove the bias, will it run like an nn.quantized.Conv2d?
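
For reference, here is roughly what I mean in question 2 (a minimal sketch; the shapes and sizes are just placeholders):

```python
import torch
import torch.nn as nn

# What question 2 describes: a plain nn.Conv2d whose weight has been cast
# to torch.int8 (not torch.qint8), fed an int8 input, with the bias removed.
conv = nn.Conv2d(3, 8, kernel_size=3, bias=False)
conv.weight = nn.Parameter(conv.weight.to(torch.int8), requires_grad=False)
x = torch.randint(-128, 128, (1, 3, 32, 32), dtype=torch.int8)
# out = conv(x)  # would this run like nn.quantized.Conv2d?
```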

Are you talking about pt2e quantization? We have a guide here: How to Write a Quantizer for PyTorch 2 Export Quantization, in the PyTorch tutorials.
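
To sketch what the guide covers: a pt2e quantizer subclasses torch.ao.quantization.quantizer.Quantizer and annotates nodes in the exported graph rather than swapping modules. A rough outline for conv-only quantization (the observer choices and qspec values here are illustrative, not prescriptive; follow the tutorial for the real details):

```python
import torch
from torch.ao.quantization.observer import HistogramObserver, MinMaxObserver
from torch.ao.quantization.quantizer import (
    QuantizationAnnotation,
    QuantizationSpec,
    Quantizer,
)

# Illustrative qspecs: per-tensor int8 activations, symmetric int8 weights.
act_qspec = QuantizationSpec(
    dtype=torch.int8,
    quant_min=-128,
    quant_max=127,
    qscheme=torch.per_tensor_affine,
    observer_or_fake_quant_ctr=HistogramObserver,
)
weight_qspec = QuantizationSpec(
    dtype=torch.int8,
    quant_min=-127,
    quant_max=127,
    qscheme=torch.per_tensor_symmetric,
    observer_or_fake_quant_ctr=MinMaxObserver,
)

class ConvOnlyQuantizer(Quantizer):
    def annotate(self, model: torch.fx.GraphModule) -> torch.fx.GraphModule:
        for node in model.graph.nodes:
            # The exact aten op to match can vary with how the model was
            # exported; conv2d.default is what the tutorial matches.
            if node.target is not torch.ops.aten.conv2d.default:
                continue
            inp, weight = node.args[0], node.args[1]
            node.meta["quantization_annotation"] = QuantizationAnnotation(
                input_qspec_map={inp: act_qspec, weight: weight_qspec},
                output_qspec=act_qspec,  # conv outputs are quantized too
                _annotated=True,
            )
        return model

    def validate(self, model: torch.fx.GraphModule) -> None:
        pass
```

You would then run the exported model through prepare_pt2e(model, quantizer), calibrate it, and call convert_pt2e, all from torch.ao.quantization.quantize_pt2e.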

  1. You don’t need to replace nn.Conv2d modules in pt2e; you annotate the exported graph instead.
  2. nn.quantized.Conv2d also quantizes its output, I think, so only quantizing the weight and input is not enough; see the sketch below.
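
For comparison, here is how nn.quantized.Conv2d is actually used in eager mode: it expects a quantized input tensor and returns a quantized output (the scale/zero_point values here are just placeholders):

```python
import torch
import torch.nn as nn

# Eager-mode quantized conv: the input must already be a quantized tensor,
# and the output comes back quantized with the module's scale/zero_point.
qconv = nn.quantized.Conv2d(3, 8, kernel_size=3)
x = torch.randn(1, 3, 32, 32)
qx = torch.quantize_per_tensor(x, scale=0.1, zero_point=0, dtype=torch.quint8)
out = qconv(qx)
print(out.dtype)  # torch.quint8 -- the output is quantized too
```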