Share qparams in pt2e

plankp · July 28, 2023, 3:37pm

Hi, I’m following the QUANTIZATION IN PYTORCH 2.0 EXPORT TUTORIAL guide and the xnnpack-quantizer in the repo to create a new Quantizer. I am trying to have both inputs of add share the same qparams, just like the sample code provided in the tutorial.

It works when I just have add nodes, but as soon as a convolution node appears, the two inputs no longer have the same qparams. Here’s a simple model that illustrates this issue:

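from torch import nn

class Net(nn.Module):
  def __init__(self):
    super().__init__()
    # Single 3x3 convolution; padding keeps the spatial size so the add is valid
    self.conv = nn.Conv2d(1, 1, 3, padding=(1, 1))

  def forward(self, x):
    # One add operand is the raw input, the other is the conv output
    return x + self.conv(x)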

Is there something that I am missing in my quantizer?
link to the quantizer: Quantizer for PT2E quantization · GitHub

(In this example, swapping the operands of the add node is sufficient, but once both operands come from different convolutions, the problem reappears.)

supriyar · July 28, 2023, 4:02pm

Thanks for reporting! FYI, PT2 export quantization is still an early prototype, so we may have missed some cases, or the tutorial may need an update to make this clearer.

cc @jerryzh168 to take a look

jerryzh168 · July 28, 2023, 5:29pm

Thanks for trying out our new APIs! I think the current API can support this. Swapping the operands seems reasonable for your example, so let me address the case where there are two conv ops.

Let’s say we have:

 [conv1_op] -> conv1_output -> add_input1 -> [add_op]
 [conv2_op] -> conv2_output -> add_input2    /

In total there are four things we can annotate: conv1_output, add_input1, conv2_output and add_input2. When you annotate [conv1_op], [conv2_op] and [add_op], the inputs of add don’t share quantization parameters because the outputs of [conv1_op] and [conv2_op] (conv1_output and conv2_output) don’t have shared quantization specs, so each one gets its own quantization parameters. To support this, you can do something like the following:

# SharedQuantizationSpec and QuantizationAnnotation come from the PT2E quantizer
# module; its import path has changed between PyTorch releases.
arg1, arg2 = add_node.args
# Share with the output qspec of the arg1 node
_share_with_arg1_output = SharedQuantizationSpec(arg1)
# Make the output of arg2 share with the output of arg1
if _is_annotated(arg1) and _is_annotated(arg2):
    arg2.meta["quantization_annotation"].output_qspec = _share_with_arg1_output

# Both inputs of add should share with the same spec as well
add_node.meta["quantization_annotation"] = QuantizationAnnotation(
    input_qspec_map={arg1: _share_with_arg1_output, arg2: _share_with_arg1_output},
    ...
)
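
To make the effect of the sharing concrete, here is a small pure-Python toy model. It is not how PyTorch implements this: ToyObserver, calibrate and the spec dictionaries are hypothetical names made up for illustration, and the activation values are invented. A spec of None means "own qparams", and a string means "share with that name", loosely mirroring SharedQuantizationSpec.

```python
from dataclasses import dataclass


@dataclass
class ToyObserver:
    """Hypothetical min/max observer; a stand-in, not a PyTorch class."""
    lo: float = float("inf")
    hi: float = float("-inf")

    def observe(self, values):
        self.lo = min(self.lo, min(values))
        self.hi = max(self.hi, max(values))

    def qparams(self, qmin=-128, qmax=127):
        # Affine int8 qparams; the range is widened to include zero.
        lo, hi = min(self.lo, 0.0), max(self.hi, 0.0)
        scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid a zero scale
        zero_point = round(qmin - lo / scale)
        return scale, zero_point


def calibrate(specs, activations):
    """Give every name the qparams of the root it (transitively) shares with."""
    def root(name):
        while specs[name] is not None:
            name = specs[name]
        return name

    observers = {}
    for name, values in activations.items():
        # Names that resolve to the same root feed the same observer.
        observers.setdefault(root(name), ToyObserver()).observe(values)
    return {name: observers[root(name)].qparams() for name in specs}


# Invented activations: each add input sees the same tensor as its producer.
conv1, conv2 = [-1.0, 0.5, 2.0], [-4.0, 0.5]
activations = {"conv1_output": conv1, "add_input1": conv1,
               "conv2_output": conv2, "add_input2": conv2}

# Each conv output keeps its own spec: the add inputs end up with different qparams.
independent = calibrate(
    {"conv1_output": None, "conv2_output": None,
     "add_input1": "conv1_output", "add_input2": "conv2_output"},
    activations,
)

# conv2_output shares with conv1_output: everything resolves to one observer.
shared = calibrate(
    {"conv1_output": None, "conv2_output": "conv1_output",
     "add_input1": "conv1_output", "add_input2": "conv1_output"},
    activations,
)

print(independent["add_input1"] == independent["add_input2"])  # False
print(shared["add_input1"] == shared["add_input2"])            # True
```

In the independent case the two conv outputs are observed separately, so their ranges and scales differ. Once conv2_output points at conv1_output, a single observer sees both tensors and both add inputs get the same scale and zero point.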