How to use LSTMCell with LayerNorm?

Vannila · June 12, 2019, 1:58pm

I want to use LayerNorm with an LSTM, but I’m not sure what the best way to combine them is.
My code is as follows:

rnn = nn.LSTMCell(in_channels, hidden_dim)

hidden, cell = rnn(x, (hidden, cell))

So, if I want to add LayerNorm to this model, should I do it like this?

rnn = nn.LSTMCell(in_channels, hidden_dim)
norm = nn.LayerNorm(hidden_dim)

hidden, cell = rnn(x, (hidden, cell))
hidden = norm(hidden)
cell = norm(cell)

Is this the right way to use it? I’m confused. Sorry for the noob question.

richard · June 12, 2019, 2:47pm

https://pytorch.org/blog/optimizing-cuda-rnn-with-torchscript/ is a good tutorial. In particular, you can copy and paste from this file, which has a LayerNorm LSTM implementation: https://github.com/pytorch/pytorch/blob/cbcb2b5ad767622cf5ec04263018609bde3c974a/benchmarks/fastrnns/custom_lstms.py#L60-L83
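
For a rough idea of what that kind of cell looks like, here is a minimal sketch in plain nn.Module form. It is not a copy of the linked file: the class name LayerNormLSTMCellSketch and the use of bias-free nn.Linear layers are my own choices for illustration. The idea is to normalize the gate pre-activations and the new cell state inside the cell, rather than normalizing hidden and cell after nn.LSTMCell has already run. Note also that each LayerNorm here has its own parameters, whereas the snippet in the question shares a single norm between hidden and cell.

```python
import torch
import torch.nn as nn


class LayerNormLSTMCellSketch(nn.Module):
    """Illustrative LSTM cell with LayerNorm inside the cell (hypothetical name)."""

    def __init__(self, input_size, hidden_size):
        super().__init__()
        # No biases on the projections: LayerNorm supplies a learnable bias.
        self.ih = nn.Linear(input_size, 4 * hidden_size, bias=False)
        self.hh = nn.Linear(hidden_size, 4 * hidden_size, bias=False)
        # Separate norms for the input gates, the hidden gates, and the cell state.
        self.ln_i = nn.LayerNorm(4 * hidden_size)
        self.ln_h = nn.LayerNorm(4 * hidden_size)
        self.ln_c = nn.LayerNorm(hidden_size)

    def forward(self, x, state):
        h, c = state
        # Normalize each projection before summing them into the gates.
        gates = self.ln_i(self.ih(x)) + self.ln_h(self.hh(h))
        i, f, g, o = gates.chunk(4, dim=1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        g = torch.tanh(g)
        # Normalize the updated cell state before it feeds the output.
        c_next = self.ln_c(f * c + i * g)
        h_next = o * torch.tanh(c_next)
        return h_next, c_next
```

It returns (hidden, cell) like nn.LSTMCell, so it should drop into the loop from the question as rnn = LayerNormLSTMCellSketch(in_channels, hidden_dim), with no extra norm calls afterwards.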

angerhang · November 9, 2021, 3:02pm

Unfortunately, the referenced script doesn’t support padded sequences. Any idea when PyTorch will support this officially?
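
In the meantime, one possible workaround is to step through the padded batch manually and freeze the state of any sequence that has already ended. The sketch below assumes time-major input of shape (seq_len, batch, input_dim), a lengths tensor on the same device as the input, and a cell that follows the nn.LSTMCell interface. The helper name run_padded is made up for this example.

```python
import torch
import torch.nn as nn


def run_padded(cell, x, lengths, hidden_dim):
    """Run an LSTMCell-style module over a padded batch (hypothetical helper).

    x:       (seq_len, batch, input_dim), padded past each sequence's length
    lengths: (batch,) tensor holding the true length of each sequence
    """
    seq_len, batch, _ = x.shape
    h = x.new_zeros(batch, hidden_dim)
    c = x.new_zeros(batch, hidden_dim)
    outputs = []
    for t in range(seq_len):
        h_new, c_new = cell(x[t], (h, c))
        # True for sequences that still have a real token at step t.
        active = (lengths > t).unsqueeze(1)
        # Finished sequences keep their last valid state.
        h = torch.where(active, h_new, h)
        c = torch.where(active, c_new, c)
        # Emit zeros at padded positions.
        outputs.append(torch.where(active, h_new, torch.zeros_like(h_new)))
    return torch.stack(outputs), (h, c)
```

For example, out, (h, c) = run_padded(nn.LSTMCell(4, 6), x, torch.tensor([3, 1]), 6) leaves the second sequence's final state as it was after its single real step. This is slower than a fused kernel with packed sequences, since every step still computes the full batch.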