Using quantized addition


There seems to be a quantized::add operator, but I can’t figure out how to use it:

import torch

class QuantAdd(torch.nn.Module):
    def __init__(self):
        super(QuantAdd, self).__init__()
        self.quant = torch.quantization.QuantStub()
        self.dequant = torch.quantization.DeQuantStub()

    def forward(self, x, y):
        x = self.quant(x)
        y = self.quant(y)
        out = x + y
        return self.dequant(out)

model = QuantAdd()

torch.backends.quantized.engine = 'qnnpack'
model.qconfig = torch.quantization.get_default_qconfig('qnnpack')
torch.quantization.prepare(model, inplace=True)

a = torch.rand(1, 3, 4, 4)
b = torch.rand(1, 3, 4, 4)

_ = model(a, b)

torch.quantization.convert(model, inplace=True)
traced_model = torch.jit.trace(model, (a, b))

The above will result in the error:

NotImplementedError: Could not run 'aten::empty.memory_format' with arguments from the 'QuantizedCPU' backend.

I tried out.copy_(x + y) on a pre-allocated tensor, but I still get the ‘aten::empty.memory_format’ error, which I guess comes from the tensor that x + y returns. So I tried in-place addition (both x += y and x.add_(y)) and got:

NotImplementedError: Could not run 'aten::add.out' with arguments from the 'QuantizedCPU' backend.

Am I missing something obvious? What’s the right way to use quantized addition in a model?



Actually I think I found the way.
I need to use a torch.nn.quantized.FloatFunctional as a functor for the addition, and then replace it with a torch.nn.quantized.QFunctional after model conversion but before tracing.
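
For reference, here is a minimal sketch of what that looks like, reusing the model from the first post. The attribute name add_fn and the engine fallback to 'fbgemm' are my own choices for illustration. As far as I can tell, torch.quantization.convert already swaps FloatFunctional for QFunctional through its default module mapping, so the manual swap may not be needed; treat that as an assumption to check against your PyTorch version.

```python
import torch

class QuantAdd(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()
        self.dequant = torch.quantization.DeQuantStub()
        # Functor that wraps the addition so an observer can be attached to it.
        # "add_fn" is a hypothetical attribute name chosen for this sketch.
        self.add_fn = torch.nn.quantized.FloatFunctional()

    def forward(self, x, y):
        x = self.quant(x)
        y = self.quant(y)
        # Use the functor instead of the bare "+" operator.
        out = self.add_fn.add(x, y)
        return self.dequant(out)

# Assumption: fall back to 'fbgemm' when 'qnnpack' is not built in.
engine = 'qnnpack' if 'qnnpack' in torch.backends.quantized.supported_engines else 'fbgemm'
torch.backends.quantized.engine = engine

model = QuantAdd().eval()
model.qconfig = torch.quantization.get_default_qconfig(engine)
torch.quantization.prepare(model, inplace=True)

a = torch.rand(1, 3, 4, 4)
b = torch.rand(1, 3, 4, 4)

# Calibration pass so the observers record activation ranges.
_ = model(a, b)

torch.quantization.convert(model, inplace=True)

# Check which class the functor ended up as after conversion.
print(type(model.add_fn))

traced_model = torch.jit.trace(model, (a, b))
out = traced_model(a, b)
```

The result is dequantized before it is returned, so out is a regular float tensor that matches a + b only up to quantization error.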

Is that the recommended way to proceed?
Or is there a more straightforward alternative?