How to freeze the FakeQuantize zero_point during training

I need to train a quantized model with a zero offset (zero_point = 0) due to limitations of my inference framework.
I’m following the flow described in
So the model is prepared with prepare_qat, which adds FakeQuantize layers.
The problem is that both scale and zero_point are being trained. I need the zero_point to be fixed at 0.

The scale and zero_point aren’t trained; they are calculated by observers inserted in the network. You can implement an observer specific to your use case which will fix the zero_point at 0. For reference, the zero_point calculation happens in
Observers are set when you initialize the qconfig (in this case you seem to be using the default. i.e.
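
To make the arithmetic concrete, here is a minimal plain-Python sketch of what an observer-style calculation could look like when the zero_point is pinned to 0. The function names, the signed range of -128..127, and the eps floor are assumptions for illustration, not the library's actual code.

```python
def zero_offset_scale(min_val, max_val, quant_min=-128, quant_max=127, eps=1e-8):
    """Pick a scale from observed min/max so that zero_point can stay at 0."""
    # Use the larger magnitude so both signs fit in the symmetric range.
    max_abs = max(abs(min_val), abs(max_val))
    # Half of the integer range covers each side of zero.
    scale = max_abs / ((quant_max - quant_min) / 2)
    # Guard against a zero scale when the observed range is empty.
    return max(scale, eps)


def fake_quantize(x, scale, zero_point=0, quant_min=-128, quant_max=127):
    """Quantize then dequantize a single float, as a fake-quant layer would."""
    q = round(x / scale) + zero_point
    q = min(max(q, quant_min), quant_max)  # clamp to the integer range
    return (q - zero_point) * scale


scale = zero_offset_scale(-1.0, 2.0)
print(scale, fake_quantize(-1.0, scale), fake_quantize(2.0, scale))
```

The point of the sketch is that only the scale depends on the observed statistics; with a signed range centred on zero, a fixed zero_point of 0 does not clip negative values.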

Thanks for your input.

Do you know of a good example of applying custom observers?
qconfig = QConfig(activation=FakeQuantize.with_args(observer=,
Is this enough to set a custom observer, or are there some nuances?

You can follow any of the observers defined in [] as a starting point.

To enable it in the qconfig you can do:
FakeQuantize.with_args(observer=MyObserver, quant_min=0, quant_max=255, dtype=torch.quint8, qscheme=torch.per_tensor_affine)
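
Putting the pieces together, a rough sketch of a custom observer and its qconfig might look like the following. ZeroOffsetObserver is a hypothetical name, and the choice of a signed qint8 range with torch.per_tensor_symmetric is my assumption so that a zero offset does not clip negative values. As far as I know the stock MinMaxObserver already reports a zero_point of 0 for that scheme, so the override mainly makes the guarantee explicit. Newer releases expose the same classes under torch.ao.quantization.

```python
import torch
from torch.quantization import FakeQuantize, MinMaxObserver, QConfig


class ZeroOffsetObserver(MinMaxObserver):
    """Hypothetical observer that always reports zero_point == 0."""

    def calculate_qparams(self):
        # Reuse the stock min/max based scale, then force the offset to zero.
        scale, zero_point = super().calculate_qparams()
        return scale, torch.zeros_like(zero_point)


# Symmetric signed range, so a zero offset does not clip negative values.
zero_offset_fq = FakeQuantize.with_args(
    observer=ZeroOffsetObserver,
    quant_min=-128,
    quant_max=127,
    dtype=torch.qint8,
    qscheme=torch.per_tensor_symmetric,
)
qconfig = QConfig(activation=zero_offset_fq, weight=zero_offset_fq)

# Quick check on random data: the observer runs inside the forward pass.
fq = zero_offset_fq()
fq(torch.randn(4, 8))
print(fq.scale, fq.zero_point)
```

You would then set model.qconfig = qconfig before calling prepare_qat. Whether signed 8-bit activations suit your inference framework is something to check on your side.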
