What is the workflow for converting a PyTorch model to a TFLite model with quantization?

My board only supports TFLite models (this is mandatory) for a classification task. If I develop with TensorFlow, deploying TFLite on edge devices is straightforward: TensorFlow model (.h5) ==> quantize and save to TFLite model. TensorFlow supports many quantization techniques and they are easy to use.
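For example, in pure TensorFlow, post-training quantization is just a couple of converter flags, something like this minimal sketch (model.h5 is a placeholder for my actual classifier):

```python
import tensorflow as tf

# Load the trained float32 Keras model (.h5)
model = tf.keras.models.load_model("model.h5")

converter = tf.lite.TFLiteConverter.from_keras_model(model)
# Enable post-training quantization (dynamic-range by default)
converter.optimizations = [tf.lite.Optimize.DEFAULT]

tflite_model = converter.convert()
with open("model_quant.tflite", "wb") as f:
    f.write(tflite_model)
```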

But nowadays the community widely uses PyTorch as the framework for deep learning, and I want to develop with PyTorch to get a better float32 model. I have read in forums that the workflow for deploying TFLite on edge devices is as follows; I call it flow 2 (sketched below):
.pt model ==> .onnx model ==> .tflite model
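As I understand it, flow 2 would look roughly like this, assuming the onnx and onnx-tf packages (MyModel, the weights path, and the input shape are placeholders):

```python
import torch
import onnx
import tensorflow as tf
from onnx_tf.backend import prepare

# Step 1: export the trained PyTorch model to ONNX
model = MyModel()  # placeholder for my float32 classifier
model.load_state_dict(torch.load("model.pt"))
model.eval()

dummy_input = torch.randn(1, 3, 224, 224)  # assumed input shape
torch.onnx.export(model, dummy_input, "model.onnx", opset_version=13)

# Step 2: convert ONNX to a TensorFlow SavedModel
onnx_model = onnx.load("model.onnx")
prepare(onnx_model).export_graph("saved_model_dir")

# Step 3: convert the SavedModel to TFLite (still float32 at this point)
converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")
with open("model.tflite", "wb") as f:
    f.write(converter.convert())
```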
My question: with the PyTorch flow, where can I apply quantization (PTQ or QAT) in flow 2 to get better performance from the TFLite model?
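My own guess is that PTQ would go at the very last step, when the converted SavedModel is turned into TFLite, roughly like this sketch (saved_model_dir and the calibration input shape are assumptions carried over from the flow above):

```python
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")
converter.optimizations = [tf.lite.Optimize.DEFAULT]

# Calibration data for full-integer quantization
# (shape assumed to match the NCHW input of the ONNX-exported model)
def representative_dataset():
    for _ in range(100):
        yield [tf.random.normal([1, 3, 224, 224])]

converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]

tflite_model = converter.convert()
with open("model_int8.tflite", "wb") as f:
    f.write(tflite_model)
```

But I am not sure whether quantization done on the PyTorch side (PTQ or QAT) would survive the ONNX ==> TF conversion, which is why I am asking.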
Thank you so much.


@ptrblck Please help me. Thank you.

I'm not familiar enough with your quantization workflow, but moved your topic to the quantization category.
