Best practice or suggestion for QAT?

I am curious about disable_observer and freeze_bn_stats in quantization aware training. I don’t know when I should apply them. I have tried different combinations of the two parameters, and they seem to have a big impact on accuracy. Is there a best practice for quantization aware training? For example, should I disable the observer first, and if so, when? Should I train from scratch or fine-tune a trained model?

hi @eleflea, check out https://github.com/pytorch/vision/blob/master/references/classification/train_quantization.py for one example. One approach which has proven to work well is:

  • start QAT training from a floating point pre-trained model, with observers and fake_quant enabled
  • after a couple of epochs, freeze the BN stats if your network has any BNs (epoch == 3 in the example)
  • after a couple of epochs, disable observers (epoch == 4 in the example); see the sketch below
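
Roughly, the training loop could look like the sketch below (eager-mode QAT APIs: prepare_qat, disable_observer, freeze_bn_stats, convert). Note that build_float_model, train_one_epoch, evaluate, optimizer and the data loaders are placeholders for your own code, not part of the linked script:

```python
import torch
import torch.quantization

# Placeholders: build_float_model() should return a quantization-ready float model
# (QuantStub/DeQuantStub inserted, conv+bn+relu fused) loaded from a pre-trained
# checkpoint; train_one_epoch/evaluate/optimizer/data loaders are your own code.
model = build_float_model()
model.train()

model.qconfig = torch.quantization.get_default_qat_qconfig("fbgemm")
torch.quantization.prepare_qat(model, inplace=True)  # inserts observers + fake_quant

num_epochs = 8
freeze_bn_epoch = 3         # matches "epoch == 3" above
disable_observer_epoch = 4  # matches "epoch == 4" above

for epoch in range(num_epochs):
    if epoch >= disable_observer_epoch:
        # stop updating quantization ranges; fake_quant stays active
        model.apply(torch.quantization.disable_observer)
    if epoch >= freeze_bn_epoch:
        # stop updating BN running stats
        model.apply(torch.nn.intrinsic.qat.freeze_bn_stats)

    train_one_epoch(model, optimizer, data_loader, epoch)

    # evaluate an actual int8 model by converting a copy
    # (inplace=False leaves the QAT model untouched for further training)
    int8_model = torch.quantization.convert(model.eval(), inplace=False)
    evaluate(int8_model, data_loader_test)
    model.train()
```

The cutover epochs here are just the values from the linked torchvision script; they are worth tuning for your own model and dataset.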

Thanks, I’ll try it.