I trained my quantized network and was able to export the weights and biases to run inference on my platform.
However, I cannot find the data collected by the PyTorch observers, so I have to manually calculate the parameters that scale the outputs of each layer (convolutional and fully connected layers). How can I query the observers and extract this information directly from PyTorch?
Thanks.
You can call calculate_qparams() on an observer instance:
scale, zero_point = observer.calculate_qparams()
Please see the torch.quantization page of the PyTorch 1.7.0 documentation for more info.
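As a minimal sketch of what that call looks like on a standalone observer (the input tensor and the choice of MinMaxObserver with quint8 here are illustrative assumptions, not taken from your model):

```python
import torch
from torch.quantization import MinMaxObserver

# Create an observer *instance*; calculate_qparams() is an instance method.
observer = MinMaxObserver(dtype=torch.quint8, qscheme=torch.per_tensor_affine)

# Feed it some data so it can record the running min/max.
# These values are made up for illustration.
observer(torch.tensor([-1.0, 0.0, 2.0]))

# Derive the quantization parameters from the recorded statistics.
scale, zero_point = observer.calculate_qparams()
print(scale, zero_point)
```

The observer has to see data before the parameters mean anything, which is what the calibration pass does in a real model.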
Forgive me if this is a trivial question, but I am a beginner. This is my code:
model_quantized = model
eval_batch_size = value_batch_size
model_quantized.eval()
model_quantized.qconfig = torch.quantization.get_default_qconfig('qnnpack')
torch.backends.quantized.engine = 'qnnpack'
torch.quantization.prepare(model_quantized, inplace=True)
evaluate(model_quantized, criterion, loader_train, neval_batches=num_calibration_batches)
torch.quantization.convert(model_quantized, inplace=True)
top1, top5 = evaluate(model_quantized, criterion, loader_valid, neval_batches=num_eval_batches)
temp_scale, temp_zero_point = torch.quantization.observer.MinMaxObserver.calculate_qparams()
But when I run it, I get:
Traceback (most recent call last):
  File "main2D.py", line 785, in <module>
    temp_scale, temp_zero_point = torch.quantization.observer.MinMaxObserver.calculate_qparams()
TypeError: calculate_qparams() missing 1 required positional argument: 'self'
Should I write the command differently, or put it in a different place?
Thank you very much.
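calculate_qparams() is an instance method, so you'd want to call it on the observer instances attached to your model rather than on the MinMaxObserver class. After this line of code,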
evaluate(model_quantized, criterion, loader_train, neval_batches=num_calibration_batches)
if you print your model, you should see observer modules attached, with statistics. You can query these module instances. For example, if you have a model.conv1, you can then call model.conv1.activation_post_process.calculate_qparams() to get the qparams for the activation of conv1.
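Here is a rough end-to-end sketch of that idea. TinyNet, the random calibration data, and the qparams dictionary are hypothetical stand-ins for your own model, your evaluate(...) calibration call, and whatever export format you use:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

class TinyNet(nn.Module):
    """Hypothetical stand-in for the real model."""
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()
        self.conv1 = nn.Conv2d(1, 2, kernel_size=3)
        self.dequant = torch.quantization.DeQuantStub()

    def forward(self, x):
        return self.dequant(self.conv1(self.quant(x)))

model = TinyNet().eval()
model.qconfig = torch.quantization.get_default_qconfig('qnnpack')
torch.quantization.prepare(model, inplace=True)  # attaches observers

# Calibration: random inputs stand in for evaluate(...) on real data.
with torch.no_grad():
    for _ in range(4):
        model(torch.randn(1, 1, 8, 8))

# Collect activation qparams from every module that has an observer attached.
qparams = {}
for name, module in model.named_modules():
    if hasattr(module, 'activation_post_process'):
        scale, zero_point = module.activation_post_process.calculate_qparams()
        qparams[name] = (float(scale), int(zero_point))

for name, (s, z) in qparams.items():
    print(f"{name}: scale={s:.6f}, zero_point={z}")
```

Note that the loop runs after calibration but before torch.quantization.convert. In your script the query comes after convert, at which point the observed modules have already been replaced by quantized ones; there the output qparams should instead be readable from the converted modules themselves (for example model.conv1.scale and model.conv1.zero_point).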