I am curious about disable_observer and freeze_bn_stats in quantization-aware training. I don't know when I should apply them. I have tried different combinations of the two settings, and they seem to have a big impact on accuracy. Is there a best practice for quantization-aware training? For example, should I disable the observer first, and if so, when? And should I train from scratch or fine-tune an already-trained model?
Hi @eleflea, check out https://github.com/pytorch/vision/blob/master/references/classification/train_quantization.py for one example. One approach that has proven to work well is:
- start QAT training from a floating point pre-trained model and with observers and fake_quant enabled
- after a couple of epochs, freeze the BN stats if your network has any BNs (epoch == 3 in the example)
- after a couple of epochs, disable observers (epoch == 4 in the example)
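The schedule above can be sketched as a small helper. This is only an illustration: the constant names and the `qat_switches` function are made up for this post, and the epoch thresholds follow the torchvision example. The `model.apply(...)` calls shown in the comments are the actual PyTorch entry points used in the linked script (`torch.quantization.disable_observer`, `torch.nn.intrinsic.qat.freeze_bn_stats`).

```python
# Illustrative schedule helper; constant names are mine, not a PyTorch API.
FREEZE_BN_EPOCH = 3         # epoch to freeze BN running stats (per the example)
DISABLE_OBSERVER_EPOCH = 4  # epoch to stop observer updates (per the example)

def qat_switches(epoch):
    """Return which QAT toggles should be off at the start of `epoch`."""
    return {
        "freeze_bn_stats": epoch >= FREEZE_BN_EPOCH,
        "disable_observer": epoch >= DISABLE_OBSERVER_EPOCH,
    }

# In an actual PyTorch training loop this maps to (per epoch, before training):
#   if qat_switches(epoch)["freeze_bn_stats"]:
#       model.apply(torch.nn.intrinsic.qat.freeze_bn_stats)
#   if qat_switches(epoch)["disable_observer"]:
#       model.apply(torch.quantization.disable_observer)
```

Note that both toggles stay off once flipped: freezing BN stats first lets the fake-quantized network settle, and disabling observers afterwards fixes the quantization parameters for the remaining epochs.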
Thanks, I’ll try it.