Convert quantized model to ONNX format

Hello,

I am working on quantizing a model using FX graph mode quantization (so the result is a GraphModule).

  1. I want to ask about the best way to export it to ONNX format (if that is supported).
    Do I have to convert it to TorchScript first (torch.jit.trace or torch.jit.script), or can I export it directly with the torch.onnx.export API? (Roughly what I mean is sketched after this list.)

  2. Are (dynamically) quantized LSTM/GRU layers/cells exportable to ONNX? (I saw that ONNX supports LSTM layers but not GRU)
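
For concreteness, here is roughly the flow I am attempting; the toy model, the "fbgemm" backend choice, and the file names are just placeholders, and the direct export in option A is exactly the part I am unsure about:

    import torch
    import torch.nn as nn
    from torch.ao.quantization import get_default_qconfig_mapping
    from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

    class ToyModel(nn.Module):
        def __init__(self):
            super().__init__()
            self.linear = nn.Linear(16, 8)

        def forward(self, x):
            return self.linear(x)

    model = ToyModel().eval()
    example_inputs = (torch.randn(1, 16),)

    # FX graph mode post-training quantization
    qconfig_mapping = get_default_qconfig_mapping("fbgemm")
    prepared = prepare_fx(model, qconfig_mapping, example_inputs)
    prepared(*example_inputs)          # calibration pass (real data in practice)
    quantized = convert_fx(prepared)   # quantized GraphModule

    # Option A: export the quantized GraphModule directly
    torch.onnx.export(quantized, example_inputs, "quantized_direct.onnx", opset_version=13)

    # Option B: trace to TorchScript first, then export the traced module
    traced = torch.jit.trace(quantized, example_inputs)
    torch.onnx.export(traced, example_inputs, "quantized_traced.onnx", opset_version=13)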

Looking forward to hearing from you!
Peace,

Hi Ahmed,

As far as I know, exporting quantized models to ONNX is not officially supported, and we are not actively working on this integration. However, here’s a thread that you may find useful: ONNX export of quantized model - #32 by tsaiHY. I would guess that more complex ops like LSTMs/GRUs in particular are not well supported.

Best,
-Andrew

Thanks a lot @andrewor for your reply!

From what I understand from the thread you sent, QAT models are not exportable to ONNX.
For the moment I am working with post-training quantization; is that still unsupported as well?
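
For example, the dynamically quantized LSTM case from my first post looks roughly like the sketch below; the toy model and file name are placeholders, and I do not know whether the export call itself will go through:

    import torch
    import torch.nn as nn
    from torch.ao.quantization import quantize_dynamic

    class ToyLSTM(nn.Module):
        def __init__(self):
            super().__init__()
            self.lstm = nn.LSTM(input_size=16, hidden_size=32, batch_first=True)
            self.fc = nn.Linear(32, 4)

        def forward(self, x):
            out, _ = self.lstm(x)
            return self.fc(out[:, -1, :])

    model = ToyLSTM().eval()

    # Dynamic post-training quantization of the LSTM and Linear layers
    quantized = quantize_dynamic(model, {nn.LSTM, nn.Linear}, dtype=torch.qint8)

    example_input = torch.randn(2, 10, 16)
    traced = torch.jit.trace(quantized, example_input)
    torch.onnx.export(traced, (example_input,), "quantized_lstm.onnx", opset_version=13)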

Thank you!

The ONNX path is in general supported by our team; please see the response here: Quantization — PyTorch 2.0 documentation
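
If an export does succeed, one quick sanity check (assuming the onnx Python package is installed, and using the placeholder file name from the sketch above) is to list the op types that ended up in the exported graph and see whether the quantized ops survived:

    import onnx

    m = onnx.load("quantized_lstm.onnx")
    print(sorted({node.op_type for node in m.graph.node}))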