How to quantize the weights but not the activations for QAT

I was wondering how to use the QConfig object to quantize only the weights, and not the activations at all. The motivation is that quantizing only the weights still shrinks the model size, while the accuracy gap between the 32-bit and 8-bit models is much smaller.

You can use the weight-only fake quant qconfig: pytorch/ at master · pytorch/pytorch · GitHub. Note that we don’t have kernel support for most of the weight-only quantized ops, though.
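As a rough sketch of what that looks like: a `QConfig` takes separate factories for the activation and weight observers, so passing `torch.nn.Identity` for the activation side makes activation observation a no-op while weights are still fake-quantized. The `TinyNet` model below is hypothetical, just to have something to prepare.

```python
import torch
import torch.nn as nn
from torch.ao.quantization import QConfig, default_weight_fake_quant, prepare_qat

# Hypothetical toy model for illustration.
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3)
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(self.conv(x))

model = TinyNet()

# Weight-only QConfig: fake-quantize the weights, and use nn.Identity
# as the activation "observer" so activations pass through unquantized.
model.qconfig = QConfig(activation=nn.Identity,
                        weight=default_weight_fake_quant)

# prepare_qat expects the model in training mode; it swaps in QAT modules
# that apply the weight fake quant during forward passes.
model.train()
prepared = prepare_qat(model)
```

`torch.ao.quantization` also ships a ready-made `default_weight_only_qconfig` built the same way, which you can assign directly instead of constructing the `QConfig` yourself.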