Hi,
I’m fairly new to PyTorch and I’d like to understand how to import a quantized TFLite model into PyTorch so I can work on it there.
I already have a PyTorch model definition which matches the model used to create the .tflite file, except that the .tflite file has been quantized, presumably automatically at export time.
There are two aspects of this I want to understand better.
First, the conv2d kernels and biases in the TFLite file are float16. Of course, I load these tensors’ buffers into float16 numpy arrays when I am reading from the tflite file. But is it enough to use these float16 numpy arrays as the values when I am populating the state_dict for my PyTorch model? Or do I need to define the torch model differently, for instance when I initialize the nn.Conv2d modules?
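To make the first question concrete, here is a minimal sketch of what I am currently trying for a single layer. The shapes and the random arrays are made-up stand-ins for what I read from the file, and I am assuming that TFLite stores a regular conv2d kernel as (out_channels, height, width, in_channels), so I permute it to PyTorch's (out_channels, in_channels, height, width). I upcast to float32 explicitly and leave the nn.Conv2d module at its default dtype, but I am not sure this is the right approach.

```python
import numpy as np
import torch
import torch.nn as nn

# Hypothetical stand-ins for one kernel and bias read from the .tflite file.
# Assumption: regular conv2d kernels are laid out as (out, kH, kW, in).
rng = np.random.default_rng(0)
kernel_f16 = rng.random((8, 3, 3, 4)).astype(np.float16)
bias_f16 = rng.random(8).astype(np.float16)

# The matching PyTorch layer, created with its default float32 parameters.
conv = nn.Conv2d(in_channels=4, out_channels=8, kernel_size=3)

# Upcast to float32 (exact for float16 values), then permute to
# PyTorch's (out, in, kH, kW) layout.
weight = torch.from_numpy(kernel_f16.astype(np.float32)).permute(0, 3, 1, 2).contiguous()
bias = torch.from_numpy(bias_f16.astype(np.float32))

# Populate the layer's parameters from the converted arrays.
conv.load_state_dict({"weight": weight, "bias": bias})
```

After this, the layer's parameters are float32 and hold the float16 values from the file, so the rest of the model can stay in float32.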
Second, I notice that this TFLite model has many (about 75) automatically generated “dequantize” layers after the normal-seeming part of the model. Do I need to manually add layers to my PyTorch model to match all these TFLite dequantization layers?
I’d appreciate any advice, especially pointers to any examples of how to do this.