I am doing per-channel quantization of a conv model, following https://pytorch.org/tutorials/advanced/static_quantization_tutorial.html. The scales and zero points of the weights and of the intermediate outputs are always 1 and 0, i.e. the range is the same as in the float model. I have a custom module in between the ConvBNReLU modules that I want to quantize separately, but the quant and dequant stubs only round the outputs or cast them back to float. Since no scale and zero point are attached, I cannot quantize my custom module properly. Is there another way to quantize so that the output of the quantized ConvBNReLU layer gets real scales and zero points, with the output/weight range spread over -128 to 127 or 0 to 255 instead of staying the same as in the float model?
The scales and zero points are 1 and 0 by default. Did you run the calibration step and then the quantize step?
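
To show why calibration matters, here is a rough pure-Python sketch of how a min/max observer could turn the value range seen during calibration into a scale and zero point. This is not PyTorch's actual observer code; the function names (`affine_qparams`, `quantize`, `dequantize`) and the sample values are made up for illustration, and it assumes a plain affine mapping onto an unsigned 8-bit range.

```python
def affine_qparams(min_val, max_val, qmin=0, qmax=255):
    """Derive (scale, zero_point) mapping [min_val, max_val] onto [qmin, qmax]."""
    # Widen the range to include 0.0 so that zero is exactly representable.
    min_val = min(min_val, 0.0)
    max_val = max(max_val, 0.0)
    if max_val == min_val:
        # Nothing observed (or all zeros): fall back to scale 1, zero point 0.
        return 1.0, 0
    scale = (max_val - min_val) / (qmax - qmin)
    zero_point = qmin - round(min_val / scale)
    # Keep the zero point inside the integer range.
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point


def quantize(x, scale, zero_point, qmin=0, qmax=255):
    """Map a float to its clamped integer representation."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Map a quantized integer back to an approximate float."""
    return (q - zero_point) * scale


# Hypothetical activations collected while running calibration data.
observed = [-1.0, -0.2, 0.0, 0.7, 3.0]
scale, zero_point = affine_qparams(min(observed), max(observed))
print(scale, zero_point)
print([quantize(x, scale, zero_point) for x in observed])
```

If no data has been observed, a scheme like this has nothing to derive a range from, which is consistent with seeing scale 1 and zero point 0 until calibration has actually run.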