Converting quantized models from PyTorch to ONNX

I am trying to export a quantized int8 PyTorch model to ONNX. The model was produced by following this tutorial:

https://pytorch.org/tutorials/advanced/static_quantization_tutorial.html

However, PyTorch to ONNX conversion of quantized models is not supported. Depending on the type of quantized model, the exporter either says explicitly that the conversion is not supported or throws an AttributeError.

My question is: how do we do the conversion manually? Specifically, how do we define a custom mapping from PyTorch classes to ONNX operations? I assume the logic is the same as for non-quantized layers, whose conversions also had to be defined by hand before they were built in, but I am having trouble finding an example.

cc @supriyar, who might be able to help.

We currently only support conversion to ONNX for the Caffe2 backend. This thread has additional context on what we currently support: ONNX export of quantized model

If you would like to add custom conversion logic from quantized PyTorch ops to ONNX operators, you can follow the code in https://github.com/pytorch/pytorch/blob/master/torch/onnx/symbolic_caffe2.py, which adds the mapping for the Caffe2 ops in ONNX.
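
For illustration, here is a minimal sketch of what such a mapping might look like. It is not taken from the linked file. The target op name "my_domain::QuantizedRelu" and the RecordingGraph class are made up for this example, and I am assuming that torch.onnx.register_custom_op_symbolic is the right hook for your PyTorch version and that the exporter can actually trace your quantized model. A symbolic function receives the graph being built (g) plus the op's inputs, and returns the ONNX node(s) that should replace the PyTorch op.

```python
class RecordingGraph:
    """Stand-in for the exporter's graph object; it only records g.op calls."""

    def __init__(self):
        self.nodes = []

    def op(self, name, *inputs, **attrs):
        # The real exporter would append an ONNX node here.
        self.nodes.append((name, inputs, attrs))
        return f"{name}_out"


def quantized_relu_symbolic(g, x):
    # Hypothetical mapping: "my_domain::QuantizedRelu" is a made-up ONNX op
    # name. The runtime that loads the exported file must implement it.
    return g.op("my_domain::QuantizedRelu", x)


def register_with_exporter(opset_version=11):
    # Imported lazily so the sketch above runs without PyTorch installed.
    import torch.onnx

    # Assumption: the quantized op is exposed to the exporter as
    # "quantized::relu"; check the failing op name in your own export error.
    torch.onnx.register_custom_op_symbolic(
        "quantized::relu", quantized_relu_symbolic, opset_version
    )


# Dry run without PyTorch: see which node the symbolic function would emit.
g = RecordingGraph()
out = quantized_relu_symbolic(g, "x")
print(g.nodes)
```

To try it for real, call register_with_exporter() before torch.onnx.export(...). Each quantized op the exporter complains about would need its own symbolic function, and whatever backend loads the resulting ONNX file has to understand the custom ops you emit.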

@Joseph_Konan Hello, are you now able to convert the quantized model to ONNX? Thank you!

Thanks for the update — I’ll look into this!