Hi @jcaip,
Thanks for your reply. I didn't do anything unusual; the static quantization code is as follows:
import copy

import torch
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx


def quantize_model(model):
    # define calibration function
    def calibrate(model):
        model.eval()
        with torch.no_grad():
            model(input)

    model_to_quantize = copy.deepcopy(model)
    model_to_quantize.eval()

    # prepare
    backend = 'qnnpack'
    qconfig_mapping = get_default_qconfig_mapping(backend)
    qconfig_mapping.set_module_name("some_module", None)
    example_inputs = torch.rand(input_size)
    model_prepared = prepare_fx(model_to_quantize, qconfig_mapping, example_inputs)

    # calibrate and convert
    calibrate(model_prepared)
    model_quantized = convert_fx(model_prepared)
    return model_quantized
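In case it helps, here is a minimal self-contained version that reproduces the pattern (the toy model, channel counts, and input shape are made up for illustration; they are not from my actual model):

```python
import copy

import torch
import torch.nn as nn
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

# Toy model mirroring the Conv2d + GroupNorm pattern from the excerpt above
model = nn.Sequential(
    nn.Conv2d(4, 8, kernel_size=3, padding=1),
    nn.GroupNorm(2, 8),
    nn.Conv2d(8, 8, kernel_size=3, padding=1),
).eval()

example_inputs = (torch.rand(1, 4, 16, 16),)
qconfig_mapping = get_default_qconfig_mapping("qnnpack")

prepared = prepare_fx(copy.deepcopy(model), qconfig_mapping, example_inputs)
with torch.no_grad():
    prepared(*example_inputs)  # calibration pass
quantized = convert_fx(prepared)

# Print which submodules ended up quantized and which stayed in fp32
for name, mod in quantized.named_modules():
    print(name, type(mod))
```

Running this and inspecting the printout shows the same behavior as the larger model: the convolutions are replaced by quantized modules while GroupNorm is left as-is.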
An excerpt of the output model is shown below; only the GroupNorm layers are left unchanged:
GraphModule(
  (0): QuantizedConv2d(4, 4, kernel_size=(1, 1), stride=(1, 1), scale=0.08744074404239655, zero_point=170)
  (1): Module(
    (conv_in): QuantizedConv2d(4, 512, kernel_size=(3, 3), stride=(1, 1), scale=0.0658976137638092, zero_point=135, padding=(1, 1))
    (mid_block): Module(
      (resnets): Module(
        (0): Module(
          (norm1): GroupNorm(32, 512, eps=1e-06, affine=True)
          (conv1): QuantizedConv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), scale=1.9838380813598633, zero_point=207, padding=(1, 1))
          (norm2): GroupNorm(32, 512, eps=1e-06, affine=True)
          (dropout): QuantizedDropout(p=0.0, inplace=False)
          (conv2): QuantizedConv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), scale=0.08311469107866287, zero_point=94, padding=(1, 1))
        )
        (1): Module(
        ...
On another note: after converting this model to TorchScript, it runs correctly on desktop, but on Android the output is wrong. That may be unrelated to this issue, but if you have any pointers on running it on Android, I'd appreciate them.
I am also concerned that the QNNPACK GitHub page lists GroupNormalization as not covered. In that case, is it implemented as a combination of other primitive operations?
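In the meantime, one workaround I'm considering (just a sketch, in case it's relevant) is to explicitly keep every GroupNorm in fp32 via the qconfig mapping, so that the dequant/quant boundaries around it are made explicit rather than left to the backend config:

```python
import torch.nn as nn
from torch.ao.quantization import get_default_qconfig_mapping

qconfig_mapping = get_default_qconfig_mapping("qnnpack")
# Assign no qconfig to every GroupNorm so prepare_fx/convert_fx
# leave those modules in fp32 and insert dequant/quant around them
qconfig_mapping.set_object_type(nn.GroupNorm, None)
```

Would that be the recommended way to handle an op the backend doesn't cover?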
Thanks for your help!