Unconverted GroupNorm with FX Graph Mode Quantization

When I performed static post-training quantization (PTQ) on my model using the FX Graph Mode Quantization API, it seems that only the GroupNorm layers in the model were not converted to QuantizedGroupNorm. A quantized GroupNorm appears to exist as torch.ao.nn.quantized.GroupNorm, but nn.GroupNorm is not converted to it. Is this operation supported?

torch version: 2.1.0

Hi @kmitsunami

Yes I believe that GroupNorm should be supported for static PTQ. Can you share the code you are using to do static quantization?

Hi @jcaip,

Thanks for your reply. I didn’t do anything unusual; the static quantization code is as follows:

import copy

import torch
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx


def quantize_model(model, input, input_size):
    # `input` is a representative calibration tensor; `input_size` is its shape

    # define calibration function
    def calibrate(model):
        model.eval()
        with torch.no_grad():
            model(input)

    model_to_quantize = copy.deepcopy(model)
    model_to_quantize.eval()

    # prepare: default QNNPACK qconfig mapping, skip quantization for "some_module"
    backend = 'qnnpack'
    qconfig_mapping = get_default_qconfig_mapping(backend)
    qconfig_mapping.set_module_name("some_module", None)
    example_inputs = torch.rand(input_size)
    model_prepared = prepare_fx(model_to_quantize, qconfig_mapping, example_inputs)

    # calibrate with representative data
    calibrate(model_prepared)

    # convert the observed model to a quantized model
    model_quantized = convert_fx(model_prepared)

    return model_quantized
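
For reference, I call it roughly like this (the model class and input shape below are just placeholders):

# hypothetical driver code; MyModel and the shape are placeholders
model = MyModel()
input_size = (1, 4, 64, 64)
calib_input = torch.rand(input_size)
model_quantized = quantize_model(model, calib_input, input_size)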

And an excerpt of the output model is shown below. Only the GroupNorm layers remain unchanged:

GraphModule(
  (0): QuantizedConv2d(4, 4, kernel_size=(1, 1), stride=(1, 1), scale=0.08744074404239655, zero_point=170)
  (1): Module(
    (conv_in): QuantizedConv2d(4, 512, kernel_size=(3, 3), stride=(1, 1), scale=0.0658976137638092, zero_point=135, padding=(1, 1))
    (mid_block): Module(
      (resnets): Module(
        (0): Module(
          (norm1): GroupNorm(32, 512, eps=1e-06, affine=True)
          (conv1): QuantizedConv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), scale=1.9838380813598633, zero_point=207, padding=(1, 1))
          (norm2): GroupNorm(32, 512, eps=1e-06, affine=True)
          (dropout): QuantizedDropout(p=0.0, inplace=False)
          (conv2): QuantizedConv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), scale=0.08311469107866287, zero_point=94, padding=(1, 1))
        )
        (1): Module(
         ...

On another note, after converting this model to TorchScript, it works correctly on the desktop, but when I run it on Android, the output is wrong. This may not be related to this issue, but if you have any notes on how to run it on Android, I would appreciate it if you could let me know.

I am also concerned that the QNNPACK GitHub page mentions that Group Normalization is not covered. In that case, is it implemented as a combination of other primitive operations?

Thanks for your help!

We only support functional.group_norm in the FX flow currently, I think: https://github.com/pytorch/pytorch/blob/main/torch/ao/quantization/backend_config/_common_operator_config_utils.py#L420, although it is also available as a quantized module: https://github.com/pytorch/pytorch/blob/main/torch/ao/nn/quantized/modules/normalization.py#L46

Hi @jerryzh168, thanks for your reply.

I’m having some difficulty understanding what you are saying. Does this mean that Quantized GroupNorm is available, but it cannot be used in QNNPACK, and thus, it won’t work on Android? What are the possible solutions if I want to run it on Android?

I think he’s saying the module version of GroupNorm isn’t supported, but the functional one is.

Ah, I see. So if I use functional.group_norm in the model definition rather than the module version, it should be converted to QuantizedGroupNorm? @jerryzh168

Thank you, @HDCharles.

No, QuantizedGroupNorm is the result of quantizing nn.GroupNorm. If you use F.group_norm, you should get this op: https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/native/quantized/library.cpp#L143 (torch.ops.quantized.group_norm), I think.
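
For example, a rough sketch of wrapping the functional op in your own module (the module and parameter names here are just illustrative), so the FX flow sees F.group_norm instead of an nn.GroupNorm submodule:

import torch
import torch.nn as nn
import torch.nn.functional as F

class FunctionalGroupNorm(nn.Module):
    # illustrative stand-in for nn.GroupNorm that calls F.group_norm,
    # which the FX flow can quantize to torch.ops.quantized.group_norm
    def __init__(self, num_groups, num_channels, eps=1e-6, affine=True):
        super().__init__()
        self.num_groups = num_groups
        self.eps = eps
        if affine:
            self.weight = nn.Parameter(torch.ones(num_channels))
            self.bias = nn.Parameter(torch.zeros(num_channels))
        else:
            self.weight = None
            self.bias = None

    def forward(self, x):
        return F.group_norm(x, self.num_groups, self.weight, self.bias, self.eps)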

Okay, so what would be the answer to my original question? My issue is that even after quantizing a model containing nn.GroupNorm, it does not get converted to QuantizedGroupNorm. Why isn’t QuantizedGroupNorm used after quantization, and how can I make the flow use it? I am not familiar with the code you linked. From a developer’s perspective, I would like to know how to correctly quantize the model.

I see. I would recommend moving to the new quantization flow (PT2 export quantization), since the old flow is currently in maintenance mode and adding new features/improvements there is not prioritized on our end. That said, the new flow has more limited op support than the FX one for now.

It’s not too hard to extend FX graph mode quantization to support the quantized GroupNorm module, though (might be less than 10 lines); let me know if you want to give it a try.
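
If you want to experiment, a rough, unverified sketch of what extending the backend config might look like is below. The DTypeConfig values are assumptions, and I haven’t checked whether set_reference_quantized_module accepts torch.ao.nn.quantized.GroupNorm directly or whether the lowering will then actually swap in the quantized module:

import torch
import torch.nn as nn
import torch.ao.nn.quantized as nnq
from torch.ao.quantization.backend_config import (
    BackendPatternConfig, DTypeConfig, ObservationType, get_qnnpack_backend_config,
)
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

# assumed dtype config for a statically quantized GroupNorm (quint8 activations)
group_norm_dtype_config = DTypeConfig(
    input_dtype=torch.quint8,
    output_dtype=torch.quint8,
)

# pattern config mapping the float module to the quantized one;
# using nnq.GroupNorm as the reference quantized module is an assumption here
group_norm_config = (
    BackendPatternConfig(nn.GroupNorm)
    .set_observation_type(ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT)
    .add_dtype_config(group_norm_dtype_config)
    .set_root_module(nn.GroupNorm)
    .set_reference_quantized_module(nnq.GroupNorm)
)

backend_config = get_qnnpack_backend_config().set_backend_pattern_config(group_norm_config)

# then pass the extended backend_config to both prepare_fx and convert_fx:
# model_prepared = prepare_fx(model, qconfig_mapping, example_inputs, backend_config=backend_config)
# model_quantized = convert_fx(model_prepared, backend_config=backend_config)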