QAT model drops accuracy after converting with torch.ao.quantization.convert

Hello everyone.

I am implementing QAT for a YOLOv8 model with 4-bit weights and 8-bit activations by setting quant_min and quant_max in the qconfig. The model gives quite good results during training and eval, but after I convert it with torch.ao.quantization.convert, the evaluation results become very bad. Does anyone know how to solve this problem?
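For context, here is a minimal sketch of the kind of qconfig the question describes: a custom QConfig whose FakeQuantize modules use quant_min/quant_max for a 4-bit weight range and an 8-bit activation range. This is an assumed reconstruction of the setup, not the poster's actual code; observer choices and qschemes are illustrative.

```python
import torch
from torch.ao.quantization import (
    QConfig,
    FakeQuantize,
    MovingAverageMinMaxObserver,
)

# 8-bit activations: unsigned affine range [0, 255]
act_fake_quant = FakeQuantize.with_args(
    observer=MovingAverageMinMaxObserver,
    quant_min=0,
    quant_max=255,
    dtype=torch.quint8,
    qscheme=torch.per_tensor_affine,
)

# "4-bit" weights: signed symmetric range [-8, 7], but note the
# values are still stored in an 8-bit dtype -- only the range is
# restricted, which is why the backend kernels may not honor it.
weight_fake_quant = FakeQuantize.with_args(
    observer=MovingAverageMinMaxObserver,
    quant_min=-8,
    quant_max=7,
    dtype=torch.qint8,
    qscheme=torch.per_tensor_symmetric,
)

qconfig = QConfig(activation=act_fake_quant, weight=weight_fake_quant)
```

During training this simulates 4-bit weights fine, but convert lowers to backend int8 kernels, which is one plausible source of the accuracy gap the question reports.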

Can you provide the full code? I don't think 4-bit quantization is supported by torch.ao.quantization.convert.

Also, this flow is going to be deprecated; we recommend using the newer flow described on the Quantization page of the PyTorch documentation instead.