Hi.

I’m trying to use PyTorch’s quantization scheme.

I’d like to quantize only the weights with fake quantization (QAT), not the activations.

I tried this:

```
import torch.quantization as Q

model = load_model(my_config)  # currently I'm using a ResNet architecture
qat_model = Q.fuse_modules(model, my_modules_to_fuse)
qat_model.qconfig = Q.QConfig(activation=Q.NoopObserver, weight=Q.FakeQuantize)
qat_model = Q.prepare_qat(qat_model)
```
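For reference, here is a self-contained sketch of that setup on a toy single-conv model standing in for my ResNet (`load_model` and `my_modules_to_fuse` are from my own code, and fusion is skipped for brevity). It shows the modules I expect `prepare_qat` to attach:

```python
import torch
import torch.nn as nn
import torch.quantization as Q

# Toy stand-in for the real ResNet; fusion skipped for brevity
model = nn.Sequential(nn.Conv2d(3, 8, 3))
model.train()  # prepare_qat expects a model in training mode
model.qconfig = Q.QConfig(activation=Q.NoopObserver, weight=Q.FakeQuantize)
qat_model = Q.prepare_qat(model)

conv = qat_model[0]
print(type(conv.weight_fake_quant).__name__)        # weight fake-quant module
print(type(conv.activation_post_process).__name__)  # activation observer
```

So up to this point the weight fake-quant and the no-op activation observer are both in place, which is what I intended.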

and this training loop from the PyTorch quantization tutorial:

```
for nepoch in range(8):
    train_one_epoch(qat_model, criterion, optimizer, data_loader, torch.device('cpu'), num_train_batches)
    if nepoch > 3:
        # Freeze quantizer parameters
        qat_model.apply(torch.quantization.disable_observer)
    if nepoch > 2:
        # Freeze batch norm mean and variance estimates
        qat_model.apply(torch.nn.intrinsic.qat.freeze_bn_stats)

    # Check the accuracy after each epoch
    quantized_model = torch.quantization.convert(qat_model.eval(), inplace=False)
    quantized_model.eval()
    top1, top5 = evaluate(quantized_model, criterion, data_loader_test, neval_batches=num_eval_batches)
    print('Epoch %d: evaluation accuracy on %d images, %2.2f' % (nepoch, num_eval_batches * eval_batch_size, top1.avg))
```

But the program raises this error:

```
calculate_qparams should not be called for NoopObserver
```

The reason I used NoopObserver was precisely to avoid calculate_qparams being called for the activations, so this result confuses me.

How can I solve this? Any suggestions would be appreciated.

Thanks.