How to quantize the weights but not the activations for QAT

I was wondering how to use the QConfig object to quantize only the weights, and not the activations at all. The motivation is that quantizing only the weights still shrinks the model size, while the accuracy gap between the 32-bit and 8-bit models is much smaller.

You can use the weight-only fake quant qconfig: pytorch/ at master · pytorch/pytorch · GitHub. Note that we don’t have kernel support for most of the weight-only quantized ops, though.
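As a rough sketch of what that looks like: a `QConfig` takes separate factories for the activation and weight observers, so passing `torch.nn.Identity` for the activation side makes activation observation a no-op while weights are still fake-quantized. The `TinyNet` model below is hypothetical, just to have something to prepare.

```python
import torch
import torch.nn as nn
from torch.ao.quantization import QConfig, default_weight_fake_quant, prepare_qat

# Hypothetical toy model for illustration.
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3)
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(self.conv(x))

model = TinyNet()

# Weight-only QConfig: fake-quantize the weights, and use nn.Identity
# as the activation "observer" so activations pass through unquantized.
model.qconfig = QConfig(activation=nn.Identity,
                        weight=default_weight_fake_quant)

# prepare_qat expects the model in training mode; it swaps in QAT modules
# that apply the weight fake quant during forward passes.
model.train()
prepared = prepare_qat(model)
```

`torch.ao.quantization` also ships a ready-made `default_weight_only_qconfig` built the same way, which you can assign directly instead of constructing the `QConfig` yourself.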