Quantized PyTorch model export to ONNX

I ran into a problem when exporting a quantized PyTorch model to ONNX. I have looked at this but still cannot find a solution.

When I run the following code, I get this error:

“Tried to trace <torch.torch.classes.quantized.Conv2dPackedParamsBase object at 0x564a8bee7af0> but it is not part of the active trace. Modules that are called during a trace must be registered as submodules of the thing being traced.”

Does anyone know how to fix this? Or is this an internal PyTorch issue?

import torch
import torch.quantization


class M(torch.nn.Module):
    def __init__(self):
        super(M, self).__init__()
        self.quant = torch.quantization.QuantStub()
        self.conv = torch.nn.Conv2d(1, 1, 1)
        self.relu = torch.nn.ReLU()
        self.dequant = torch.quantization.DeQuantStub()

    def forward(self, x):
        x = self.quant(x)
        x = self.conv(x)
        x = self.relu(x)
        x = self.dequant(x)
        return x

# create a model instance; quantization requires eval mode
model_fp32 = M()
model_fp32.eval()

# attach a quantization config and fuse conv+relu before calibration
model_fp32.qconfig = torch.quantization.get_default_qconfig('fbgemm')
model_fp32_fused = torch.quantization.fuse_modules(model_fp32, [['conv', 'relu']])
model_fp32_prepared = torch.quantization.prepare(model_fp32_fused)

# run representative data through the model to calibrate the observers
input_fp32 = torch.randn(4, 1, 4, 4)
model_fp32_prepared(input_fp32)
model_int8 = torch.quantization.convert(model_fp32_prepared)


output_x = model_int8(input_fp32)
#traced = torch.jit.trace(model_int8, (input_fp32,))

torch.onnx.export(model_int8,          # model being run
                  input_fp32,          # model input (or a tuple for multiple inputs)
                  './model_int8.onnx', # where to save the model (file or file-like object)
                  export_params=True,  # store the trained parameter weights inside the model file
                  opset_version=11,    # the ONNX version to export the model to
                  #do_constant_folding=True,  # whether to execute constant folding for optimization
                  #input_names=['input'],     # the model's input names
                  #output_names=['output'],   # the model's output names
                  #example_outputs=traced(input_fp32)
                  )

I think the error output may be a bit misleading. I’ll take a look at the backend. Did you see this post: ONNX export of quantized model - #17 by mhamdan?

Yeah, but I didn’t see that error when I changed from ‘fbgemm’ to ‘qnnpack’.
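For context (not part of the original post): the runtime quantization engine should agree with the backend named in the qconfig, so switching qconfigs usually goes together with switching the engine. A minimal sketch, assuming a PyTorch build that ships the qnnpack engine:

```python
import torch

# Engines available in this build (on x86 this typically includes
# 'fbgemm' and 'qnnpack').
print(torch.backends.quantized.supported_engines)

# qnnpack targets ARM/mobile, fbgemm targets x86 servers; the engine
# should match the backend passed to get_default_qconfig.
if 'qnnpack' in torch.backends.quantized.supported_engines:
    torch.backends.quantized.engine = 'qnnpack'
    qconfig = torch.quantization.get_default_qconfig('qnnpack')
    print(torch.backends.quantized.engine)
```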

The error I have seen is the same as in ZyrianovS’s posts 24 and 25.

Hey, sorry for the delay. Can you please add a jit tag to the post? We’re thinking this may not be a quantization issue and may actually be associated with the JIT.

Hi guys,

Conversion of Torchvision (v0.11) INT8 quantized models to ONNX produces the following error:

AttributeError: 'torch.dtype' object has no attribute 'detach'

Is it not supported yet?