How to quantize a model with both CNN and LSTM

Nanton · August 16, 2022, 4:49pm

Hi there,

Suppose a model uses a CNN as its backbone and an LSTM as its head. How can the whole model be quantized with post-training quantization? It seems we can apply static quantization to the CNN and dynamic quantization to the LSTM (Quantization — PyTorch 1.12 documentation), but I am not sure how to handle a model that combines both.

Thanks in advance!

Vasiliy_Kuznetsov · August 18, 2022, 3:36pm

Hi @Nanton ,

What you said is correct: we have official support for static quantization of CNNs and dynamic quantization of LSTMs.

There is an unreleased prototype of static quantization for LSTMs; an example of how to use it is here: pytorch/test_quantized_op.py at a9ba3fe1dbf2cea45c9a7e723010c27c211f7fe3 · pytorch/pytorch · GitHub. There is no documentation or tutorial for this feature yet, but we hope to get to it in the future.
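For readers following along, the officially supported combination mentioned above (static quantization for the CNN, dynamic quantization for the LSTM) might look roughly like the sketch below in eager mode. The `CnnLstm` class, its layer sizes, and the random calibration data are hypothetical stand-ins, not taken from the thread. The sketch assumes a recent PyTorch where `torch.ao.quantization` is available.

```python
import copy
import torch
import torch.nn as nn
import torch.ao.quantization as tq


class CnnLstm(nn.Module):
    """Hypothetical toy model: conv backbone followed by an LSTM head."""

    def __init__(self):
        super().__init__()
        # QuantStub / DeQuantStub mark where tensors enter and leave int8.
        self.backbone = nn.Sequential(
            tq.QuantStub(),
            nn.Conv2d(1, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            tq.DeQuantStub(),
        )
        self.lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)
        self.fc = nn.Linear(16, 4)

    def forward(self, x):
        x = self.backbone(x)  # (N, 8, H, W), float again after DeQuantStub
        # Average over height and treat width as the time axis: (N, W, 8).
        x = x.mean(dim=2).permute(0, 2, 1).contiguous()
        out, _ = self.lstm(x)
        return self.fc(out[:, -1])


torch.manual_seed(0)
float_model = CnnLstm().eval()
model = copy.deepcopy(float_model)

# Step 1: dynamic quantization, restricted to nn.LSTM modules.
tq.quantize_dynamic(model, {nn.LSTM}, dtype=torch.qint8, inplace=True)

# Step 2: static quantization, restricted to the backbone by attaching a
# qconfig only there. The LSTM and the final Linear get no static qconfig.
model.backbone.qconfig = tq.get_default_qconfig(torch.backends.quantized.engine)
tq.prepare(model, inplace=True)
with torch.no_grad():
    for _ in range(8):
        # Random tensors stand in for a real calibration set (assumption).
        model(torch.randn(2, 1, 16, 20))
tq.convert(model, inplace=True)

x = torch.randn(2, 1, 16, 20)
with torch.no_grad():
    y_float, y_quant = float_model(x), model(x)
print(type(model.backbone[1]).__name__, type(model.lstm).__name__)
print("max abs difference vs float:", (y_float - y_quant).abs().max().item())
```

The dynamic pass runs first because `convert` clears any `qconfig` attributes it finds, so a backbone `qconfig` set earlier would be lost. With real data, the calibration loop would iterate over representative inputs instead of random tensors.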

Nanton · August 18, 2022, 6:20pm

Hi @Vasiliy_Kuznetsov ,

Appreciate your reply. I look forward to this feature as well.

Based on what we have now, I was wondering how to quantize a model with both a CNN and an LSTM. Is there any tutorial available?

Thanks

jerryzh168 · August 25, 2022, 11:17pm

We don’t have a tutorial yet. Are you using eager mode quantization? If so, LSTM is supported by default, and you can follow the original flow: (beta) Static Quantization with Eager Mode in PyTorch — PyTorch Tutorials 2.1.1+cu121 documentation

I think FX graph mode quantization does not fully support static quantization for LSTM yet.

Komail_Mehrgan · February 9, 2023, 11:41am

Simply don’t quantize the LSTM part. Try searching for how to quantize specific layers in PyTorch.
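One common way to skip specific layers in eager mode is to set a `qconfig` on the whole model and then override it with `None` on the modules that should stay in float. The sketch below illustrates that opt-out pattern. The `Net` class and its shapes are hypothetical, and a single random batch stands in for calibration data.

```python
import torch
import torch.nn as nn
import torch.ao.quantization as tq


class Net(nn.Module):
    """Hypothetical model: one conv layer, then an LSTM and a linear head."""

    def __init__(self):
        super().__init__()
        self.quant = tq.QuantStub()
        self.conv = nn.Conv2d(1, 4, kernel_size=3, padding=1)
        self.dequant = tq.DeQuantStub()
        self.lstm = nn.LSTM(input_size=4, hidden_size=8, batch_first=True)
        self.fc = nn.Linear(8, 2)

    def forward(self, x):
        # Only the conv runs in int8; DeQuantStub returns to float before the LSTM.
        x = self.dequant(self.conv(self.quant(x)))
        x = x.mean(dim=2).permute(0, 2, 1).contiguous()  # (N, W, 4)
        out, _ = self.lstm(x)
        return self.fc(out[:, -1])


model = Net().eval()
model.qconfig = tq.get_default_qconfig(torch.backends.quantized.engine)
# Opt the head out: modules whose qconfig is None are left untouched.
model.lstm.qconfig = None
model.fc.qconfig = None

tq.prepare(model, inplace=True)
with torch.no_grad():
    model(torch.randn(2, 1, 8, 12))  # stand-in calibration batch (assumption)
tq.convert(model, inplace=True)

with torch.no_grad():
    y = model(torch.randn(2, 1, 8, 12))
print(type(model.conv).__name__, type(model.lstm).__name__, type(model.fc).__name__)
```

The final linear layer is opted out as well because it receives float input from the LSTM. A statically quantized `Linear` would need its own `QuantStub` in front of it.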

farough · August 20, 2023, 5:02pm

How can I quantize the whole model if I have ResNet blocks followed by an LSTM layer? When I did not quantize the LSTM, the accuracy of the model was halved, and when I quantized only the LSTM with post-training dynamic quantization (PTDQ), the speedup was negligible.

HDCharles · August 22, 2023, 3:48pm

Generally, quantizing more modules isn’t going to improve accuracy. If NOT quantizing the LSTMs makes the model less accurate, something is going very wrong. I would take a deeper look, because it should be close to impossible for that to happen.
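One possible way to take that deeper look is to compare each quantized stage against its float counterpart on the same input, for example with a signal-to-quantization-noise ratio (SQNR). The helper `sqnr_db` and the `Head` wrapper below are hypothetical names, and the random input is a stand-in for real features. This is a rough diagnostic sketch, not an official debugging flow.

```python
import torch
import torch.nn as nn
import torch.ao.quantization as tq


def sqnr_db(reference: torch.Tensor, candidate: torch.Tensor) -> float:
    """Signal-to-quantization-noise ratio in dB; higher means closer to float."""
    signal_power = reference.pow(2).mean()
    noise_power = (reference - candidate).pow(2).mean().clamp_min(1e-12)
    return (10 * torch.log10(signal_power / noise_power)).item()


class Head(nn.Module):
    """Thin wrapper so the LSTM is a child module that convert() can swap."""

    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)

    def forward(self, x):
        out, _ = self.lstm(x)
        return out


torch.manual_seed(0)
float_head = Head().eval()
# Non-inplace call: float_head stays in float, dyn_head is a quantized copy.
dyn_head = tq.quantize_dynamic(float_head, {nn.LSTM}, dtype=torch.qint8)

x = torch.randn(4, 10, 8)  # stand-in for real backbone features (assumption)
with torch.no_grad():
    ref = float_head(x)
    out = dyn_head(x)

score = sqnr_db(ref, out)
print(f"LSTM stage SQNR: {score:.1f} dB")
```

Running the same comparison on the backbone alone and on the full model can help localize which stage loses accuracy. A very low SQNR on a stage that was supposedly left in float would point to a setup problem rather than to quantization error.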