QAT model drops accuracy after converting with torch.ao.quantization.convert

Hello everyone.

I am implementing QAT for a YOLOv8 model with 4-bit weights and 8-bit activations by setting quant_min and quant_max in the qconfig. The model gives quite good results during training and eval, but after I convert it with torch.ao.quantization.convert, the evaluation results become very bad. Does anyone know how to solve this problem?
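For context, here is a minimal sketch of the kind of qconfig the question describes: a custom QConfig whose FakeQuantize modules use quant_min/quant_max for a 4-bit weight range and an 8-bit activation range. This is an assumed reconstruction of the setup, not the poster's actual code; observer choices and qschemes are illustrative.

```python
import torch
from torch.ao.quantization import (
    QConfig,
    FakeQuantize,
    MovingAverageMinMaxObserver,
)

# 8-bit activations: unsigned affine range [0, 255]
act_fake_quant = FakeQuantize.with_args(
    observer=MovingAverageMinMaxObserver,
    quant_min=0,
    quant_max=255,
    dtype=torch.quint8,
    qscheme=torch.per_tensor_affine,
)

# "4-bit" weights: signed symmetric range [-8, 7], but note the
# values are still stored in an 8-bit dtype -- only the range is
# restricted, which is why the backend kernels may not honor it.
weight_fake_quant = FakeQuantize.with_args(
    observer=MovingAverageMinMaxObserver,
    quant_min=-8,
    quant_max=7,
    dtype=torch.qint8,
    qscheme=torch.per_tensor_symmetric,
)

qconfig = QConfig(activation=act_fake_quant, weight=weight_fake_quant)
```

During training this simulates 4-bit weights fine, but convert lowers to backend int8 kernels, which is one plausible source of the accuracy gap the question reports.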

Can you provide the full code? I don't think 4-bit quantization is supported by torch.ao.quantization.convert.

Also, this flow is going to be deprecated; we recommend using the newer flow described on the Quantization page of the PyTorch documentation instead.