My board only supports TFLite models for the classification task (this is mandatory). If I develop with TensorFlow, deploying TFLite on edge devices is straightforward: TensorFlow model (.h5) ==> quantize and save as a .tflite model. TensorFlow supports many quantization techniques, and they are easy to use.
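The TensorFlow flow above can be sketched roughly like this (a minimal post-training-quantization sketch; the tiny Keras model here is a hypothetical stand-in for the trained .h5 classifier):

```python
import tensorflow as tf

# Hypothetical small classifier standing in for the trained .h5 model.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(3, activation="softmax"),
])

# Post-training quantization: Optimize.DEFAULT enables
# dynamic-range quantization of the weights during conversion.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

# The serialized flatbuffer is what gets deployed to the board.
with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```

For full integer quantization one would additionally supply a `representative_dataset` to the converter, but the dynamic-range variant above already shows how little code the TensorFlow path needs.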
But now the community widely uses PyTorch as the framework for deep learning, and I want to develop with PyTorch to get a better float32 model. I have read in forums that the workflow for deploying TFLite on edge devices is as follows:
.pt model ==> .onnx model ==> .tflite model
I will call this flow 2.
I have a question: with the PyTorch flow, where in flow 2 can I apply quantization (PTQ or QAT) to get the best performance from the resulting tflite model?
Thank you so much.