Unconverted GroupNorm with FX Graph Mode Quantization

When I performed static post-training quantization (PTQ) on my model using the FX Graph Mode Quantization API, it seems that only the GroupNorm layers in the model were not converted to QuantizedGroupNorm. A quantized GroupNorm appears to exist as torch.ao.nn.quantized.GroupNorm, but nn.GroupNorm is not converted to it. Is this operation supported?

torch version: 2.1.0

Hi @kmitsunami

Yes I believe that GroupNorm should be supported for static PTQ. Can you share the code you are using to do static quantization?

Hi @jcaip,

Thanks for your reply. I didn’t do anything unusual; the static quantization code is as follows:

import copy

import torch
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx


def quantize_model(model, input, input_size):
    # `input` is a representative calibration tensor; `input_size` is its shape

    # define calibration function
    def calibrate(model):
        model.eval()
        with torch.no_grad():
            model(input)

    model_to_quantize = copy.deepcopy(model)
    model_to_quantize.eval()

    # prepare: default QNNPACK qconfig mapping, skip quantization for "some_module"
    backend = 'qnnpack'
    qconfig_mapping = get_default_qconfig_mapping(backend)
    qconfig_mapping.set_module_name("some_module", None)
    example_inputs = torch.rand(input_size)
    model_prepared = prepare_fx(model_to_quantize, qconfig_mapping, example_inputs)

    # calibrate with representative data
    calibrate(model_prepared)

    # convert the observed model to a quantized model
    model_quantized = convert_fx(model_prepared)

    return model_quantized
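
For reference, I call it roughly like this (the model class and input shape below are just placeholders):

# hypothetical driver code; MyModel and the shape are placeholders
model = MyModel()
input_size = (1, 4, 64, 64)
calib_input = torch.rand(input_size)
model_quantized = quantize_model(model, calib_input, input_size)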

And an excerpt of the output model is shown below. Only the GroupNorm layers remain unchanged:

GraphModule(
  (0): QuantizedConv2d(4, 4, kernel_size=(1, 1), stride=(1, 1), scale=0.08744074404239655, zero_point=170)
  (1): Module(
    (conv_in): QuantizedConv2d(4, 512, kernel_size=(3, 3), stride=(1, 1), scale=0.0658976137638092, zero_point=135, padding=(1, 1))
    (mid_block): Module(
      (resnets): Module(
        (0): Module(
          (norm1): GroupNorm(32, 512, eps=1e-06, affine=True)
          (conv1): QuantizedConv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), scale=1.9838380813598633, zero_point=207, padding=(1, 1))
          (norm2): GroupNorm(32, 512, eps=1e-06, affine=True)
          (dropout): QuantizedDropout(p=0.0, inplace=False)
          (conv2): QuantizedConv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), scale=0.08311469107866287, zero_point=94, padding=(1, 1))
        )
        (1): Module(
         ...

On another note, after converting this model to TorchScript, it works correctly on the desktop, but when I run it on Android, the output is wrong. This may not be related to this issue, but if you have any notes on how to run it on Android, I would appreciate it if you could let me know.

I am also concerned that the QNNPACK GitHub page mentions that Group Normalization is not covered. In that case, is it implemented as a combination of other primitive operations?

Thanks for your help!

We only support functional.group_norm in the FX flow currently, I think: https://github.com/pytorch/pytorch/blob/main/torch/ao/quantization/backend_config/_common_operator_config_utils.py#L420, although it is also available as a quantized module: https://github.com/pytorch/pytorch/blob/main/torch/ao/nn/quantized/modules/normalization.py#L46

Hi @jerryzh168, thanks for your reply.

I’m having some difficulty understanding what you are saying. Does this mean that Quantized GroupNorm is available, but it cannot be used in QNNPACK, and thus, it won’t work on Android? What are the possible solutions if I want to run it on Android?

I think he’s saying the module version of GroupNorm isn’t supported, but the functional one is.

Ah, I see. So if I use functional.group_norm in the model definition rather than the module version, it should be converted to QuantizedGroupNorm? @jerryzh168

Thank you, @HDCharles.

No, QuantizedGroupNorm is the result of quantizing nn.GroupNorm. If you use F.group_norm, you should get this op: https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/native/quantized/library.cpp#L143 (torch.ops.quantized.group_norm), I think.
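
For example, a rough sketch of wrapping the functional op in your own module (the module and parameter names here are just illustrative), so the FX flow sees F.group_norm instead of an nn.GroupNorm submodule:

import torch
import torch.nn as nn
import torch.nn.functional as F

class FunctionalGroupNorm(nn.Module):
    # illustrative stand-in for nn.GroupNorm that calls F.group_norm,
    # which the FX flow can quantize to torch.ops.quantized.group_norm
    def __init__(self, num_groups, num_channels, eps=1e-6, affine=True):
        super().__init__()
        self.num_groups = num_groups
        self.eps = eps
        if affine:
            self.weight = nn.Parameter(torch.ones(num_channels))
            self.bias = nn.Parameter(torch.zeros(num_channels))
        else:
            self.weight = None
            self.bias = None

    def forward(self, x):
        return F.group_norm(x, self.num_groups, self.weight, self.bias, self.eps)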

Okay, so what would be the answer to my original question? My issue is that even after quantizing a model containing nn.GroupNorm, it does not get converted to QuantizedGroupNorm. Why isn’t QuantizedGroupNorm used after quantization, and how can I make the flow use it? I am not familiar with the code you linked. From a developer’s perspective, I would like to know how to correctly quantize the model.

I see. I would recommend moving to the new quantization flow (PT2 export quantization), since the old flow is currently in maintenance mode and adding new features/improvements there is not prioritized on our end. That said, the new flow has more limited op support than the FX one for now.

It’s not too hard to extend FX graph mode quantization to support the quantized GroupNorm module, though (might be less than 10 lines); let me know if you want to give it a try.
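
If you want to experiment, a rough, unverified sketch of what extending the backend config might look like is below. The DTypeConfig values are assumptions, and I haven’t checked whether set_reference_quantized_module accepts torch.ao.nn.quantized.GroupNorm directly or whether the lowering will then actually swap in the quantized module:

import torch
import torch.nn as nn
import torch.ao.nn.quantized as nnq
from torch.ao.quantization.backend_config import (
    BackendPatternConfig, DTypeConfig, ObservationType, get_qnnpack_backend_config,
)
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

# assumed dtype config for a statically quantized GroupNorm (quint8 activations)
group_norm_dtype_config = DTypeConfig(
    input_dtype=torch.quint8,
    output_dtype=torch.quint8,
)

# pattern config mapping the float module to the quantized one;
# using nnq.GroupNorm as the reference quantized module is an assumption here
group_norm_config = (
    BackendPatternConfig(nn.GroupNorm)
    .set_observation_type(ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT)
    .add_dtype_config(group_norm_dtype_config)
    .set_root_module(nn.GroupNorm)
    .set_reference_quantized_module(nnq.GroupNorm)
)

backend_config = get_qnnpack_backend_config().set_backend_pattern_config(group_norm_config)

# then pass the extended backend_config to both prepare_fx and convert_fx:
# model_prepared = prepare_fx(model, qconfig_mapping, example_inputs, backend_config=backend_config)
# model_quantized = convert_fx(model_prepared, backend_config=backend_config)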