Looking at a quantized TorchScript model (created with jit.trace), is there a way to know whether the model was quantized for the fbgemm or the qnnpack backend? reduce_range is another parameter that affects how the scales/offsets are interpreted by a “new” backend. Is there a way to know whether reduce_range was on or off at the time of quantization? This information can be node based or graph based.
Hi Saurabh,
If you print out the prepared model graph you should be able to see what observers were inserted. I don’t think there’s a way to check what backend the model was quantized on specifically, but you could check what the quant_min and quant_max values were set to in the observers this way. The reduce_range attribute is now deprecated and replaced with quant_min and quant_max. Let me know if that helps.
Best,
-Andrew
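A minimal sketch of the observer check described above, for readers who want to do it programmatically rather than by reading the printed graph. It assumes eager-mode quantization through torch.ao.quantization; TinyNet and observer_ranges are hypothetical names made up for illustration, not part of PyTorch or of the model in this thread. It only works on a prepared (not yet converted) model, and as far as I can tell only the activation observers are present as submodules at that stage.

```python
import torch.nn as nn
from torch.ao.quantization import DeQuantStub, QuantStub, get_default_qconfig, prepare


class TinyNet(nn.Module):
    """Hypothetical stand-in model, used only to have something to prepare."""

    def __init__(self):
        super().__init__()
        self.quant = QuantStub()
        self.conv = nn.Conv2d(3, 4, kernel_size=3)
        self.dequant = DeQuantStub()

    def forward(self, x):
        return self.dequant(self.conv(self.quant(x)))


def observer_ranges(prepared_model):
    """Map submodule name -> (observer type, quant_min, quant_max).

    Any submodule exposing both attributes is treated as an observer.
    """
    ranges = {}
    for name, module in prepared_model.named_modules():
        if hasattr(module, "quant_min") and hasattr(module, "quant_max"):
            ranges[name] = (type(module).__name__, module.quant_min, module.quant_max)
    return ranges


model = TinyNet().eval()
model.qconfig = get_default_qconfig("fbgemm")  # assumption: fbgemm defaults
prepared = prepare(model)  # inserts observers, does not convert

ranges = observer_ranges(prepared)
for name, (kind, qmin, qmax) in ranges.items():
    print(f"{name}: {kind} quant_min={qmin} quant_max={qmax}")
```

Swapping "fbgemm" for "qnnpack" in get_default_qconfig and comparing the printed ranges is one way to see how the backend defaults differ. Note that this tells you what a given qconfig produces, not which backend an already converted model came from.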
Hi @andrewor,
I can work with quant_min/quant_max, but I can’t find that information either. I was specifically looking for an API that can give me this information programmatically.
Let’s consider the example below of a pre-quantized model from torchvision:
import torchvision
import torch
model= torchvision.models.quantization.googlenet(pretrained=True, quantize=True)
print(model)
This prints
QuantizableGoogLeNet(
(conv1): QuantizableBasicConv2d(
(conv): QuantizedConvReLU2d(3, 64, kernel_size=(7, 7), stride=(2, 2), scale=0.08655554801225662, zero_point=0, padding=(3, 3))
(bn): Identity()
(relu): Identity()
)
(maxpool1): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=True)
(conv2): QuantizableBasicConv2d(
(conv): QuantizedConvReLU2d(64, 64, kernel_size=(1, 1), stride=(1, 1), scale=0.05618245154619217, zero_point=0)
(bn): Identity()
(relu): Identity()
)
(conv3): QuantizableBasicConv2d(
(conv): QuantizedConvReLU2d(64, 192, kernel_size=(3, 3), stride=(1, 1), scale=0.05060958117246628, zero_point=0, padding=(1, 1))
(bn): Identity()
(relu): Identity()
)
The printed modules show only scale and zero_point. There is no information about quant_min/quant_max, reduce_range, or the engine (assuming the model was quantized with the default settings for the corresponding backend).
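For reference, here is a small sketch of how reduce_range relates to quant_min/quant_max for the 8-bit dtypes. The helper name quant_range is hypothetical, and the table reflects my understanding of what PyTorch's observers compute (reduce_range drops one bit of range); it is not an API that reads anything back from a converted model.

```python
def quant_range(dtype, reduce_range=False):
    """Return (quant_min, quant_max) for an 8-bit quantized dtype name.

    dtype is given as a string ("quint8" or "qint8") so the sketch
    runs without importing torch.
    """
    full = {"quint8": (0, 255), "qint8": (-128, 127)}
    reduced = {"quint8": (0, 127), "qint8": (-64, 63)}
    table = reduced if reduce_range else full
    if dtype not in table:
        raise ValueError(f"unsupported dtype: {dtype}")
    return table[dtype]


for dtype in ("quint8", "qint8"):
    for reduce_range in (False, True):
        print(dtype, reduce_range, quant_range(dtype, reduce_range))
```

So if the observers of the original prepared model were available, a quint8 activation observer reporting quant_max=127 would indicate that reduce_range was on.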
input_shape= [1, 3, 224, 224]
x = torch.randn(input_shape)
model.eval()
traced_mod = torch.jit.trace(model, x)
print(traced_mod)
With the jit-traced TorchScript model, print doesn’t show the scale/offset information either. There are other ways to get the scale/offset, but I’m not sure how to get the quant_min/quant_max values or the engine.
QuantizableGoogLeNet(
original_name=QuantizableGoogLeNet
(conv1): QuantizableBasicConv2d(
original_name=QuantizableBasicConv2d
(conv): RecursiveScriptModule(original_name=ConvReLU2d)
(bn): Identity(original_name=Identity)
(relu): Identity(original_name=Identity)
)