ONNX export failed for int8 model

Hao_ZHANG · November 1, 2019, 8:30am

Is quantize_per_tensor not supported by ONNX? Will more ops (like PReLU) be supported by nn.quantized?

jerryzh168 · November 1, 2019, 6:04pm

It’s not yet supported; we are still figuring out the plan for quantization support in ONNX.

zif520 · January 19, 2020, 4:06am

Does PyTorch 1.4.0 support exporting quantized models to ONNX?

jerryzh168 · February 14, 2020, 6:37pm

@supriyar has tested the quantization in onnx with one of our internal models, but I’m not sure about the long term plans for that. @supriyar can you comment?

supriyar · February 22, 2020, 1:02am

The support that exists currently is for Pytorch -> ONNX -> Caffe2 path. The intermediate onnx operators contain references to the C2 ops so cannot be executed standalone in ONNX. See https://github.com/pytorch/pytorch/blob/master/torch/onnx/symbolic_caffe2.py for more info.
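
One way to see this in an exported file is to list the operator domains that the graph uses. The sketch below is not from the thread: `non_standard_ops` is a hypothetical helper, and the `_caffe2` domain and `Int8*` op names in the demo data are assumptions about what the Caffe2 symbolics emit. It accepts any objects with `domain` and `op_type` attributes, which ONNX `NodeProto` entries expose.

```python
from collections import Counter
from types import SimpleNamespace

# Domains that denote the default ONNX operator set.
STANDARD_DOMAINS = {"", "ai.onnx"}

def non_standard_ops(nodes):
    """Count ops whose domain is outside the default ONNX operator set."""
    counts = Counter()
    for node in nodes:
        if node.domain not in STANDARD_DOMAINS:
            counts[f"{node.domain}::{node.op_type}"] += 1
    return dict(counts)

# Stand-in nodes so the sketch runs without the onnx package.
# The "_caffe2" domain and the op names are illustrative assumptions.
demo_nodes = [
    SimpleNamespace(domain="_caffe2", op_type="Int8Quantize"),
    SimpleNamespace(domain="_caffe2", op_type="Int8FC"),
    SimpleNamespace(domain="", op_type="Relu"),
    SimpleNamespace(domain="_caffe2", op_type="Int8Dequantize"),
]
print(non_standard_ops(demo_nodes))
```

With the `onnx` package installed, `non_standard_ops(onnx.load("model.onnx").graph.node)` should run the same check on a real file. Any op it reports needs a runtime that implements that domain.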

dassima · February 24, 2020, 2:48pm

Hi, I’ve read your answer, but I am confused. You first need an ONNX model, which you later convert to Caffe2. But if I get an error when exporting to ONNX, how can I get to the second step?

jerryzh168 · February 24, 2020, 11:10pm

could you paste the error message?

dassima · February 25, 2020, 2:37pm

I installed the nightly version of Pytorch.

torch.quantization.convert(model, inplace=True)
torch.onnx.export(model, img, "8INTmodel.onnx", verbose=True)


Traceback (most recent call last):
  File "check_conv_op.py", line 92, in <module>
    quantize(img)
  File "check_conv_op.py", line 59, in quantize
    torch.onnx.export(model, img, "8INTmodel.onnx", verbose=True)
  File "/usr/local/lib/python3.7/site-packages/torch/onnx/__init__.py", line 168, in export
    custom_opsets, enable_onnx_checker, use_external_data_format)
  File "/usr/local/lib/python3.7/site-packages/torch/onnx/utils.py", line 69, in export
    use_external_data_format=use_external_data_format)
  File "/usr/local/lib/python3.7/site-packages/torch/onnx/utils.py", line 485, in _export
    fixed_batch_size=fixed_batch_size)
  File "/usr/local/lib/python3.7/site-packages/torch/onnx/utils.py", line 334, in _model_to_graph
    graph, torch_out = _trace_and_get_graph_from_model(model, args, training)
  File "/usr/local/lib/python3.7/site-packages/torch/onnx/utils.py", line 282, in _trace_and_get_graph_from_model
    orig_state_dict_keys = _unique_state_dict(model).keys()
  File "/usr/local/lib/python3.7/site-packages/torch/jit/__init__.py", line 302, in _unique_state_dict
    filtered_dict[k] = v.detach()
AttributeError: 'torch.dtype' object has no attribute 'detach'

jerryzh168 · March 3, 2020, 3:07am

looks like it’s calling detach on a dtype object, could you paste check_conv_op.py?

G4V · July 14, 2020, 8:32am

Hi @dassima and @jerryzh168 - did you manage to get to the bottom of this? I’m seeing exactly the same error. A simple model exports fine without quantization.

Setting a breakpoint at the point of failure, I can see that the object being detached is torch.qint8.

Dumping the state_dict for both the non-quantized and quantized versions, the quantized version has this entry: ('fc1._packed_params.dtype', torch.qint8). The non-quantized version has only tensors.

Any thoughts as to what’s going on greatly appreciated!

Thanks.
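
The state_dict check described in this post can be sketched as a small helper. This is not from the thread: `entries_without_detach` is a hypothetical name, and the demo uses stand-in objects rather than a real quantized model so that it runs without PyTorch. It flags every state_dict value that has no `detach` method, which is the call that fails in the traceback.

```python
def entries_without_detach(state_dict):
    """Return (key, value) pairs whose value has no callable detach()."""
    return [
        (key, value)
        for key, value in state_dict.items()
        if not callable(getattr(value, "detach", None))
    ]

class FakeTensor:
    """Stand-in for torch.Tensor; only detach() matters here."""
    def detach(self):
        return self

# Stand-in for a quantized model's state_dict. The string "torch.qint8"
# plays the role of the dtype object reported in the post.
demo_state = {
    "fc1.scale": FakeTensor(),
    "fc1.zero_point": FakeTensor(),
    "fc1._packed_params.dtype": "torch.qint8",
}
print(entries_without_detach(demo_state))
# [('fc1._packed_params.dtype', 'torch.qint8')]
```

On a real model the same call would be `entries_without_detach(model.state_dict())`. An empty list means every entry can be detached.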

jerryzh168 · July 14, 2020, 4:55pm

it’s probably because of this: https://github.com/pytorch/pytorch/blob/master/torch/nn/quantized/modules/linear.py#L60

what version of pytorch are you using? if you update to nightly the problem should be gone since we changed the serialization format for linear: https://github.com/pytorch/pytorch/blob/master/torch/nn/quantized/modules/linear.py#L220

G4V · July 14, 2020, 5:47pm

Many thanks for getting back.

I was on 1.5.1 but just pulled 1.7.0.dev20200705+cpu but alas, still no joy.

Anything I can do to help debug this?

G4V · July 19, 2020, 2:16pm

@jerryzh168, any ideas on next steps? Not sure if it’s something I’m doing incorrectly or a general problem with exporting.

Many thanks.

jerryzh168 · July 22, 2020, 9:27pm

are you getting the same error message after updating to nightly?

G4V · July 25, 2020, 7:34am

Hi @jerryzh168, yes. Updated initially to 1.7.0.dev20200705+cpu and just tried torch-1.7.0.dev20200724+cpu. No luck with either.

As I hijacked an old thread, I thought best to raise a separate issue with a simple example (single fully connected layer) to replicate -

I’ve had one reply with comment explaining that exporting of quantized models is not yet supported and a link to another thread. Sounds like it’s WIP. Would be good to get your take on the example in the other thread.

Many thanks again.

jerryzh168 · July 30, 2020, 3:53pm

cc @supriyar is quantized Linear supported in ONNX?

jerryzh168 · July 30, 2020, 3:55pm

what is the error message? i think linear is supported according to https://github.com/pytorch/pytorch/blob/master/torch/onnx/symbolic_caffe2.py

supriyar · July 30, 2020, 6:27pm

How are you exporting the quantized model to ONNX? As mentioned previously, we currently only support a custom conversion flow through ONNX to Caffe2 for quantized models. The models aren’t represented in native ONNX format, but in a format specific to Caffe2.
If you wish to export the model to Caffe2, you can follow the steps here to do so (the model needs to be traced first, and operator_export_type needs to be set to ONNX_ATEN_FALLBACK):

https://github.com/pytorch/pytorch/blob/master/test/onnx/test_pytorch_onnx_caffe2_quantized.py#L17-L35

torch.backends.quantized.engine = "qnnpack"
pt_inputs = tuple(torch.from_numpy(x) for x in sample_inputs)
model.qconfig = torch.quantization.get_default_qconfig('qnnpack')
q_model = torch.quantization.prepare(model, inplace=False)
q_model = torch.quantization.convert(q_model, inplace=False)

traced_model = torch.jit.trace(q_model, pt_inputs)
buf = io.BytesIO()
torch.jit.save(traced_model, buf)
buf.seek(0)
q_model = torch.jit.load(buf)

q_model.eval()
output = q_model(*pt_inputs)

f = io.BytesIO()
torch.onnx.export(q_model, pt_inputs, f, input_names=input_names, example_outputs=output,
                  operator_export_type=torch.onnx.OperatorExportTypes.ONNX_ATEN_FALLBACK)
f.seek(0)

neginraoof · June 14, 2021, 4:32pm

Export of PyTorch QAT models to standard ONNX is supported now. You should be able to export the model without operator_export_type = ONNX_ATEN_FALLBACK.

addisonklinke · June 17, 2021, 2:23pm

@neginraoof Can you post a minimal example of PyTorch QAT → ONNX and clarify which PyTorch version this became supported in? I’m on 1.8.1 and still seeing errors similar to @G4V’s. In all the discussion I’m able to find (forum + GitHub issues), @supriyar says the only supported path still involves Caffe2.