I read the quantization docs on the PyTorch website and learned about post-training dynamic quantization and static quantization. Dynamic quantization is good for LSTM and Linear layers, and static quantization is good for CNNs. My question: when I use a CRNN model (CNN + LSTM + Linear), what is the best way to quantize it? Are there any tricks to mix the two quantization methods?
I’d appreciate it if anybody could help me! Thanks in advance!
I think it’s possible: you can apply static quantization to the CNN part of the model and dynamic quantization to the LSTM + Linear part. Since both parts take float data as input and produce float output, the combined model should work.
1. Fix (leave unquantized) the RNN and Linear layers, quantize the CNN layers (post-training static quantization)
2. Fix the RNN and Linear layers, quantize the CNN layers (quantization-aware training; this step is optional)
3. Fix the quantized CNN layers, quantize the RNN and Linear layers (post-training dynamic quantization)
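The steps above can be sketched in eager mode. This is a minimal, hypothetical CRNN (the layer sizes and the `CRNN` class itself are made up for illustration): a qconfig is attached only to the CNN part and its quant/dequant stubs for static quantization, and `quantize_dynamic` then handles the LSTM + Linear part.

```python
import torch
import torch.nn as nn

class CRNN(nn.Module):
    # Hypothetical CRNN: CNN feature extractor + LSTM + Linear head.
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()      # float -> int8 for the CNN
        self.cnn = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU())
        self.dequant = torch.quantization.DeQuantStub()  # back to float for the LSTM
        self.lstm = nn.LSTM(input_size=8 * 32, hidden_size=16, batch_first=True)
        self.fc = nn.Linear(16, 10)

    def forward(self, x):                     # x: (N, 1, 32, W)
        x = self.quant(x)
        x = self.cnn(x)                       # runs quantized after convert()
        x = self.dequant(x)                   # float again for the dynamic part
        n, c, h, w = x.shape
        x = x.permute(0, 3, 1, 2).reshape(n, w, c * h)  # (N, W, C*H) sequence
        x, _ = self.lstm(x)
        return self.fc(x)

model = CRNN().eval()

# Steps 1/2: static quantization of the CNN only -- attach a qconfig just to
# the stubs and the cnn submodule; lstm/fc get none, so they are left alone.
model.quant.qconfig = torch.quantization.get_default_qconfig("fbgemm")
model.cnn.qconfig = torch.quantization.get_default_qconfig("fbgemm")
model.dequant.qconfig = torch.quantization.get_default_qconfig("fbgemm")

torch.quantization.prepare(model, inplace=True)
model(torch.randn(2, 1, 32, 32))              # calibration pass with sample data
torch.quantization.convert(model, inplace=True)

# Step 3: dynamic quantization of the LSTM + Linear on the same model.
torch.quantization.quantize_dynamic(
    model, {nn.LSTM, nn.Linear}, dtype=torch.qint8, inplace=True)

out = model(torch.randn(2, 1, 32, 32))
print(out.shape, out.dtype)
```

The quant/dequant stub pair marks the boundary of the statically quantized region, so everything after `dequant` sees ordinary float tensors, which is exactly what the dynamically quantized LSTM and Linear expect.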
Hi, how do you fix the RNN and Linear layers while quantizing the CNN layers?
Quantization is controlled by the qconfig: when quantizing the CNN layers you can remove the qconfig from the RNN layer, and that way the RNN layer will not be quantized.
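A minimal sketch of that qconfig trick, using a made-up two-layer model: set a qconfig on the whole model, then clear it on the layer you want to skip. `prepare` only attaches observers where a qconfig is present.

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    # Hypothetical model: we only want the conv layer quantized.
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3)
        self.lstm = nn.LSTM(8, 8)

model = Net().eval()
# Attach a qconfig to the whole model...
model.qconfig = torch.quantization.get_default_qconfig("fbgemm")
# ...then clear it on the rnn so prepare/convert skip that layer.
model.lstm.qconfig = None

prepared = torch.quantization.prepare(model)
print(hasattr(prepared.conv, "activation_post_process"))  # observer attached
print(hasattr(prepared.lstm, "activation_post_process"))  # layer skipped
```

Child modules inherit the parent's qconfig during `prepare`, so an explicit `None` on a submodule overrides the model-level setting for that subtree.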
Do you have a demo of how to quantize different layers at different times? Thanks!
You can control which layers are quantized by placing quant/dequant stubs around them. For more details you can refer to the tutorial: (beta) Static Quantization with Eager Mode in PyTorch — PyTorch Tutorials 1.9.0+cu102 documentation
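As a small illustration of the stub placement (a toy model, not from the tutorial itself): only what sits between `QuantStub` and `DeQuantStub` runs in int8 after conversion, so the stubs delimit the statically quantized region.

```python
import torch
import torch.nn as nn

class StaticCNN(nn.Module):
    # Toy example: only the layers between the stubs are statically quantized.
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()      # float -> int8
        self.conv = nn.Conv2d(1, 4, 3)
        self.relu = nn.ReLU()
        self.dequant = torch.quantization.DeQuantStub()  # int8 -> float

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.conv(x))
        return self.dequant(x)   # downstream float layers (e.g. an LSTM) go here

m = StaticCNN().eval()
m.qconfig = torch.quantization.get_default_qconfig("fbgemm")
torch.quantization.fuse_modules(m, [["conv", "relu"]], inplace=True)  # conv+relu fused
torch.quantization.prepare(m, inplace=True)
m(torch.randn(1, 1, 8, 8))                   # calibration with representative data
torch.quantization.convert(m, inplace=True)

out = m(torch.randn(1, 1, 8, 8))
print(out.dtype)                             # float again, thanks to the dequant stub
```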
I am trying to do the exact same thing. Did you figure out how to do the quantization in a CNN-LSTM hybrid model?