Evaluating the saved QAT model gives different results from the results seen during training

Hey, I am new to Quantization Aware Training (QAT) with PyTorch. I have been trying to train a QAT model for my “rknn devices”, so I use the ‘qnnpack’ backend during both training and evaluation. During training, the evaluation result is mAP = 0.4, but when I run a separate evaluation of the saved model, the result is mAP = 0.3, even though I use the same evaluation dataset and the same evaluation code. When I run the same experiment without Quantization Aware Training (a float model), the evaluation results match between training and the separate evaluation. I wonder what is wrong here?

The training code is as follows:
net = mymodel()
checkpoint = torch.load(float_model_path)
net.load_state_dict(checkpoint)
net.qconfig = torch.quantization.get_default_qat_qconfig('qnnpack')
torch.quantization.prepare_qat(net, inplace=True)
net.train()
# ... QAT training loop ...
torch.quantization.convert(net.eval().cpu(), inplace=True)
torch.save(net.cpu().state_dict(), qat_model_path)
evaluate(net.eval().cpu(), ...)

The standalone evaluation code is as follows:
net = mymodel().eval()
net.qconfig = torch.quantization.get_default_qat_qconfig('qnnpack')
torch.quantization.prepare_qat(net, inplace=True)
torch.quantization.convert(net, inplace=True)
checkpoint = torch.load(qat_model_path, map_location='cpu')
net.load_state_dict(checkpoint)
evaluate(net, ...)

Both of these scripts run on my server (CentOS, V100 GPUs). PyTorch version: torch 1.7.0.

Hi @shupinghu , welcome!

Why are you using torch.quantization.convert(net.eval().cpu(), inplace=True) instead of simply torch.quantization.convert(net, inplace=True)? net.eval().cpu() is a different object from net, so the in-place conversion of net.eval().cpu() won’t actually change net.
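
For reference, something like the following would convert net itself in place before saving. This is just a sketch that reuses mymodel, qat_model_path, and evaluate from your post:

net.eval()   # switch to eval mode after the QAT training loop
net.cpu()    # convert/save on CPU
torch.quantization.convert(net, inplace=True)   # convert net itself in place
torch.save(net.state_dict(), qat_model_path)
evaluate(net, ...)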

What is your quantization engine during evaluation?
Try printing torch.backends.quantized.engine in both the training and the standalone evaluation runs to see whether they match.
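
For example, something like this near the top of both scripts (a small sketch; explicitly setting the engine to 'qnnpack' is only to make both sides match the backend used in your qconfig):

import torch

print(torch.backends.quantized.engine)        # engine currently selected
torch.backends.quantized.engine = 'qnnpack'   # force it to match the qconfig backend
print(torch.backends.quantized.engine)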