Convert quantized model to ONNX format

Hello,

I am working on quantizing a model using FX graph mode quantization (so the result is a GraphModule).

  1. I want to ask about the best way to export it to ONNX format (if that is supported).
    Do I have to convert it to TorchScript first (torch.jit.trace or torch.jit.script), or can I export it directly with the torch.onnx.export API? (Roughly what I mean is sketched after this list.)

  2. Are (dynamically) quantized LSTM/GRU layers/cells exportable to ONNX? (I saw that ONNX supports LSTM layers but not GRU)
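
For concreteness, here is roughly the flow I am attempting; the toy model, the "fbgemm" backend choice, and the file names are just placeholders, and the direct export in option A is exactly the part I am unsure about:

    import torch
    import torch.nn as nn
    from torch.ao.quantization import get_default_qconfig_mapping
    from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

    class ToyModel(nn.Module):
        def __init__(self):
            super().__init__()
            self.linear = nn.Linear(16, 8)

        def forward(self, x):
            return self.linear(x)

    model = ToyModel().eval()
    example_inputs = (torch.randn(1, 16),)

    # FX graph mode post-training quantization
    qconfig_mapping = get_default_qconfig_mapping("fbgemm")
    prepared = prepare_fx(model, qconfig_mapping, example_inputs)
    prepared(*example_inputs)          # calibration pass (real data in practice)
    quantized = convert_fx(prepared)   # quantized GraphModule

    # Option A: export the quantized GraphModule directly
    torch.onnx.export(quantized, example_inputs, "quantized_direct.onnx", opset_version=13)

    # Option B: trace to TorchScript first, then export the traced module
    traced = torch.jit.trace(quantized, example_inputs)
    torch.onnx.export(traced, example_inputs, "quantized_traced.onnx", opset_version=13)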

Looking forward to hearing from you!
Peace,

Hi Ahmed,

As far as I know, exporting quantized models to ONNX is not officially supported, and we are not actively working on this integration. However, here’s a thread that you may find useful: ONNX export of quantized model - #32 by tsaiHY. I would guess that more complex ops like LSTMs/GRUs in particular are not well supported.

Best,
-Andrew

Thanks a lot @andrewor for your reply!

From what I understand from the thread you sent, QAT models are not exportable to ONNX.
For the moment I am working with post-training quantization; is that still unsupported as well?
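
For example, the dynamically quantized LSTM case from my first post looks roughly like the sketch below; the toy model and file name are placeholders, and I do not know whether the export call itself will go through:

    import torch
    import torch.nn as nn
    from torch.ao.quantization import quantize_dynamic

    class ToyLSTM(nn.Module):
        def __init__(self):
            super().__init__()
            self.lstm = nn.LSTM(input_size=16, hidden_size=32, batch_first=True)
            self.fc = nn.Linear(32, 4)

        def forward(self, x):
            out, _ = self.lstm(x)
            return self.fc(out[:, -1, :])

    model = ToyLSTM().eval()

    # Dynamic post-training quantization of the LSTM and Linear layers
    quantized = quantize_dynamic(model, {nn.LSTM, nn.Linear}, dtype=torch.qint8)

    example_input = torch.randn(2, 10, 16)
    traced = torch.jit.trace(quantized, example_input)
    torch.onnx.export(traced, (example_input,), "quantized_lstm.onnx", opset_version=13)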

Thank you!

The ONNX path is in general supported by our team; please see the response here: Quantization — PyTorch 2.0 documentation
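
If an export does succeed, one quick sanity check (assuming the onnx Python package is installed, and using the placeholder file name from the sketch above) is to list the op types that ended up in the exported graph and see whether the quantized ops survived:

    import onnx

    m = onnx.load("quantized_lstm.onnx")
    print(sorted({node.op_type for node in m.graph.node}))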