How to quantize a model with both CNN and LSTM

Hi there,

If there is a model with a CNN as its backbone and an LSTM as its head, how can the whole model be quantized with post-training quantization? It seems we can apply static quantization to the CNN and dynamic quantization to the LSTM (Quantization — PyTorch 1.12 documentation), but I am not sure how to handle a combined case like this one.

Thanks in advance!


Hi @Nanton,

What you said is correct: we have official support for static quantization of CNNs and dynamic quantization of LSTMs.
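For example, one way to combine the two on a single model is to dynamically quantize the LSTM first and then statically quantize the rest with the eager-mode prepare/convert flow. A minimal sketch (the `CnnLstm` module, its layer sizes, and the calibration data are hypothetical, just for illustration):

```python
import torch
import torch.nn as nn

class CnnLstm(nn.Module):
    """Hypothetical toy model: CNN backbone feeding an LSTM head."""
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()      # fp32 -> int8 at the CNN input
        self.conv = nn.Conv2d(3, 8, 3, padding=1)
        self.relu = nn.ReLU()
        self.dequant = torch.quantization.DeQuantStub()  # int8 -> fp32 before the LSTM
        self.lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)

    def forward(self, x):
        x = self.dequant(self.relu(self.conv(self.quant(x))))
        x = x.flatten(2).transpose(1, 2)  # (N, C, H, W) -> (N, H*W, C)
        return self.lstm(x)[0]

model = CnnLstm().eval()

# step 1: dynamic quantization of the LSTM (int8 weights, fp32 activations)
model = torch.quantization.quantize_dynamic(model, {nn.LSTM}, dtype=torch.qint8)

# step 2: static quantization of the CNN part
model.qconfig = torch.quantization.get_default_qconfig("fbgemm")
model.lstm.qconfig = None  # already dynamically quantized; leave it alone
torch.quantization.prepare(model, inplace=True)
with torch.no_grad():
    model(torch.randn(1, 3, 8, 8))  # calibrate with representative data
torch.quantization.convert(model, inplace=True)
```

The order matters here: converting the LSTM first means the later static prepare/convert pass simply skips it, since its qconfig is None.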

There is an unreleased prototype of static quantization of LSTMs; an example of how to use it is here: pytorch/test_quantized_op.py at a9ba3fe1dbf2cea45c9a7e723010c27c211f7fe3 · pytorch/pytorch · GitHub. There is no documentation or tutorial for this feature yet, but we hope to get to it in the future.


Hi @Vasiliy_Kuznetsov,

I appreciate your reply, and I look forward to this feature as well.

Based on what we have now, I was wondering how to quantize a model with both a CNN and an LSTM. Is there any tutorial available?

Thanks

We don’t have a tutorial yet. Are you using eager mode quantization? If so, LSTM is supported by default and you can follow the original flow: (beta) Static Quantization with Eager Mode in PyTorch — PyTorch Tutorials 2.1.1+cu121 documentation
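In outline, that tutorial's recipe is: fuse modules, attach a qconfig, prepare (insert observers), calibrate, convert. A condensed sketch on a toy CNN (module names and shapes here are made up for illustration):

```python
import torch
import torch.nn as nn

class Backbone(nn.Module):
    """Hypothetical toy CNN used to illustrate the tutorial's recipe."""
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()
        self.conv = nn.Conv2d(3, 8, 3)
        self.bn = nn.BatchNorm2d(8)
        self.relu = nn.ReLU()
        self.dequant = torch.quantization.DeQuantStub()

    def forward(self, x):
        return self.dequant(self.relu(self.bn(self.conv(self.quant(x)))))

m = Backbone().eval()
m = torch.quantization.fuse_modules(m, [["conv", "bn", "relu"]])  # fuse for accuracy/speed
m.qconfig = torch.quantization.get_default_qconfig("fbgemm")      # x86 backend
torch.quantization.prepare(m, inplace=True)                       # insert observers
with torch.no_grad():
    for _ in range(10):
        m(torch.randn(1, 3, 32, 32))                              # calibration passes
torch.quantization.convert(m, inplace=True)                       # swap in int8 modules
```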

FX graph mode quantization does not fully support static quantization for LSTM yet, I think.
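That said, FX graph mode can still statically quantize the CNN while leaving the LSTM in fp32. A sketch, assuming PyTorch 1.13+ (where `QConfigMapping` is available) and a submodule named `lstm`; the model itself is hypothetical:

```python
import torch
import torch.nn as nn
from torch.ao.quantization import get_default_qconfig, QConfigMapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

class CnnLstm(nn.Module):
    """Hypothetical CNN + LSTM model; no quant stubs are needed in FX mode."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3, padding=1)
        self.lstm = nn.LSTM(8, 16, batch_first=True)

    def forward(self, x):
        x = torch.relu(self.conv(x))
        x = x.flatten(2).transpose(1, 2)  # (N, C, H, W) -> (N, H*W, C)
        return self.lstm(x)[0]

example_inputs = (torch.randn(1, 3, 8, 8),)
qconfig_mapping = (
    QConfigMapping()
    .set_global(get_default_qconfig("fbgemm"))  # static quantization globally...
    .set_module_name("lstm", None)              # ...but leave the LSTM in fp32
)
prepared = prepare_fx(CnnLstm().eval(), qconfig_mapping, example_inputs)
prepared(*example_inputs)                       # calibration pass
quantized = convert_fx(prepared)
```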

Alternatively, simply don’t quantize the LSTM part; try searching for how to quantize specific layers in PyTorch.
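In eager mode, that comes down to setting the submodule's qconfig to None before prepare, so prepare/convert skip it. A minimal sketch with a hypothetical model (only the CNN part is bracketed by quant stubs):

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    """Hypothetical CNN + LSTM; stubs bracket only the part we want quantized."""
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()
        self.conv = nn.Conv2d(3, 8, 3, padding=1)
        self.dequant = torch.quantization.DeQuantStub()
        self.lstm = nn.LSTM(8, 16, batch_first=True)

    def forward(self, x):
        x = self.dequant(torch.relu(self.conv(self.quant(x))))
        return self.lstm(x.flatten(2).transpose(1, 2))[0]

model = Net().eval()
model.qconfig = torch.quantization.get_default_qconfig("fbgemm")
model.lstm.qconfig = None  # opt the LSTM out: it stays in fp32
torch.quantization.prepare(model, inplace=True)
with torch.no_grad():
    model(torch.randn(1, 3, 8, 8))  # calibration with representative data
torch.quantization.convert(model, inplace=True)
```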

How can I quantize the whole model if I have ResNet blocks followed by an LSTM layer? When I did not quantize the LSTM, the model's accuracy was halved, and when I quantized only the LSTM with post-training dynamic quantization (PTDQ), the speedup was negligible.
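For reference, quantizing only the LSTM with PTDQ is typically a one-liner; a sketch with a hypothetical stand-in for the ResNet-blocks + LSTM model:

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    """Hypothetical stand-in for the ResNet-blocks + LSTM model described above."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Conv2d(3, 8, 3, padding=1)  # placeholder for ResNet blocks
        self.lstm = nn.LSTM(8, 16, batch_first=True)

    def forward(self, x):
        x = torch.relu(self.backbone(x)).flatten(2).transpose(1, 2)
        return self.lstm(x)[0]

model = Net().eval()
# PTDQ quantizes only the weights of the listed module types; activations stay
# fp32, so the end-to-end speedup is small when the LSTM is a minor share of runtime
quantized = torch.quantization.quantize_dynamic(model, {nn.LSTM}, dtype=torch.qint8)
print(quantized.lstm)  # the LSTM is now a dynamically quantized module
```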

Generally, quantizing more modules isn't going to improve accuracy. If NOT quantizing the LSTM makes the model less accurate, something is going very wrong; I would take a deeper look at it, because it should be close to impossible for that to happen.