Excuse me, how can I implement some quantization algorithms in PyTorch? Should I use torch.fx or torch.ao? Are there any relevant tutorials? And how can I deploy a model quantized with such an algorithm to hardware? I have tried some quantization tools, but I often run into operator incompatibility problems.
@BambooKui There are several articles that touch on the use cases for torch.fx and quantization:
For quantization using torch.fx (FX graph mode, see the sketch below): https://pytorch.org/tutorials/prototype/fx_graph_mode_quant_guide.html
For quantization using torch.ao (eager mode): https://pytorch.org/docs/stable/quantization.html
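For reference, here is a minimal post-training static quantization sketch using FX graph mode (prepare_fx/convert_fx). The toy model, the random calibration data, and the "fbgemm" backend choice are placeholders, and the exact prepare_fx signature has shifted a bit across PyTorch versions, so treat this as a sketch rather than a drop-in recipe:

```python
import torch
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

# Placeholder float model -- replace with your own nn.Module.
float_model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 16, 3),
    torch.nn.ReLU(),
).eval()

example_inputs = (torch.randn(1, 3, 32, 32),)

# Default qconfig for the chosen CPU backend ("fbgemm" for x86 servers,
# "qnnpack" for ARM); pick the one matching your deployment target.
qconfig_mapping = get_default_qconfig_mapping("fbgemm")

# prepare_fx symbolically traces the model with torch.fx and inserts observers.
prepared = prepare_fx(float_model, qconfig_mapping, example_inputs)

# Calibrate with representative data (random tensors here, purely as a stand-in).
with torch.no_grad():
    for _ in range(10):
        prepared(torch.randn(1, 3, 32, 32))

# Replace observed float ops with quantized ops.
quantized = convert_fx(prepared)
print(quantized)
```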
If you do not have a requirement to trace your DAG flow, I would recommend going with torch.ao (eager mode quantization), since it is part of the stable release; a rough sketch follows below.
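As a rough illustration of the eager mode workflow, here is a hedged sketch of post-training static quantization with torch.ao.quantization; the toy module, the fusion list, the random calibration loop, and the "fbgemm" backend are all assumptions you would replace with your own model and data:

```python
import torch
from torch import nn
from torch.ao import quantization as tq

# Toy module with explicit quant/dequant boundaries, as eager mode requires.
class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = tq.QuantStub()      # marks where fp32 -> int8 conversion happens
        self.conv = nn.Conv2d(3, 16, 3)
        self.relu = nn.ReLU()
        self.dequant = tq.DeQuantStub()  # marks where int8 -> fp32 conversion happens

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.conv(x))
        return self.dequant(x)

model = ToyModel().eval()

# Pick the qconfig for your CPU backend ("fbgemm" for x86, "qnnpack" for ARM).
model.qconfig = tq.get_default_qconfig("fbgemm")

# Fuse conv + relu so they are quantized as a single module.
tq.fuse_modules(model, [["conv", "relu"]], inplace=True)

# Insert observers, calibrate on representative data, then convert.
tq.prepare(model, inplace=True)
with torch.no_grad():
    for _ in range(10):
        model(torch.randn(1, 3, 32, 32))
tq.convert(model, inplace=True)
print(model)
```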
Thank you very much!