I want to use my PyTorch pre-trained VGG16 model on a Raspberry Pi. I am converting it to ONNX and then to TFLite. Inference with both the ONNX and TFLite models works well on my computer, but on the Pi, with the same pre-processing, the outputs are scaled differently. Do I need to quantize the model before converting from ONNX to TFLite? Any hints would be helpful. Thanks
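One way to narrow down where the scaling appears is to save one fixed input array, run it through the model on both machines, and compare the raw outputs numerically. If the Pi's outputs differ from the desktop's by a nearly constant ratio, that points at a scale factor in preprocessing (e.g. a missing `/255`) rather than at the converted weights. A minimal sketch of such a comparison, using a hypothetical `compare_outputs` helper (not part of any library):

```python
import numpy as np

def compare_outputs(ref: np.ndarray, other: np.ndarray) -> dict:
    """Compare two model outputs element-wise.

    If the two runs differ only by a constant scale factor (the symptom
    described above), the element-wise ratio will be nearly constant:
    mean_ratio gives the factor and ratio_std will be close to zero.
    """
    diff = np.abs(ref - other)
    # avoid division by zero when a reference element is exactly 0
    safe_ref = np.where(ref.ravel() == 0.0, 1e-12, ref.ravel())
    ratio = other.ravel() / safe_ref
    return {
        "max_abs_diff": float(diff.max()),
        "mean_ratio": float(ratio.mean()),
        "ratio_std": float(ratio.std()),
    }

# usage: save the desktop output with np.save("ref.npy", out) and on the Pi
# load it and call compare_outputs(np.load("ref.npy"), pi_out)
```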
My understanding is that ONNX export to Caffe2 is the only path supported for PyTorch quantized models, and even then it's spotty. Since it doesn't sound like you're using the PyTorch quantization APIs, and your existing process mostly works, you may be better off asking in a specific ONNX or TFLite group.
As far as I can tell, the issue is in the preprocessing step. After converting my model to, say, ONNX Runtime, the values at inference are scaled differently if I replicate the preprocessing steps without PyTorch transforms. Whereas when I use transforms.Compose() to do the ToTensor and normalization of my input data, the outputs of the ONNX model work like a charm. It is the same for my TFLite model. Do you know why this could be the case? Any hints? Thanks
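A common cause of this exact symptom is that `transforms.ToTensor()` silently rescales uint8 pixels from [0, 255] to [0, 1] (and reorders HWC to CHW) before `Normalize()` runs, while a hand-rolled pipeline often normalizes on the [0, 255] range, which scales everything by a factor of 255. A minimal sketch of the equivalent steps in plain NumPy, assuming the standard ImageNet mean/std used by torchvision's pretrained VGG16:

```python
import numpy as np

# ImageNet statistics used by torchvision's pretrained models (assumption:
# your Normalize() call uses these same values)
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess(img_hwc_uint8: np.ndarray) -> np.ndarray:
    """Replicate transforms.ToTensor() + transforms.Normalize() in NumPy.

    ToTensor() does two things a manual pipeline can miss: it scales
    uint8 [0, 255] to float [0, 1] AND moves channels first (HWC -> CHW).
    Normalize() then subtracts mean and divides by std per channel,
    on the [0, 1] range -- not on raw [0, 255] pixel values.
    """
    x = img_hwc_uint8.astype(np.float32) / 255.0   # [0, 255] -> [0, 1]
    x = (x - IMAGENET_MEAN) / IMAGENET_STD         # per-channel normalize
    x = np.transpose(x, (2, 0, 1))                 # HWC -> CHW
    return x[np.newaxis, ...]                      # add batch dim: NCHW
```

If the `/ 255.0` step is dropped, the inputs (and hence the outputs) end up on a very different scale, which matches what you describe.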
No idea, sorry. PyTorch quantization primarily refers to the APIs described here: Quantization — PyTorch 1.11.0 documentation
There’s some work on converting models quantized with the above APIs to ONNX, but that doesn’t sound like your issue: you have an ONNX model, converted it to TFLite, and something went wrong somewhere during that process. It’s possible someone here has experience with those steps, but since it doesn’t sound like you need help with PyTorch quantization, it may be faster to ask in an ONNX or TFLite group.