In the documentation, the output of torch.nn.LSTM is described as follows:
output (seq_len, batch, hidden_size * num_directions): tensor containing the output features (h_t) from the last layer of the RNN, for each t. If a torch.nn.utils.rnn.PackedSequence has been given as the input, the output will also be a packed sequence.
Previously I thought output was a tensor containing the output features, by which I understood o_t (the output gate activation), not h_t. However, since h_t = o_t * tanh(c_t) and o_t itself is never fed back into the LSTM (only h_t and c_t are), the description looks reasonable.
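For concreteness, here is a small sketch of the check that convinced me output really holds h_t for every timestep (single-layer, unidirectional; the shapes are arbitrary):

```python
import torch
import torch.nn as nn

seq_len, batch, input_size, hidden_size = 5, 3, 10, 20
lstm = nn.LSTM(input_size, hidden_size)  # num_layers=1, not bidirectional

x = torch.randn(seq_len, batch, input_size)
output, (h_n, c_n) = lstm(x)

print(output.shape)                         # torch.Size([5, 3, 20]): one h_t per t
print(torch.allclose(output[-1], h_n[0]))   # True: the last slice of output is h_n
```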
But in a few examples I have seen, to get the hidden state representation for each word in a sequence (a language modeling task), the code loops through all the words in that sequence. If output already gives us all the hidden state representations (h_t for every t, from the final layer), why do we need to loop through all the words of a sequence?
I asked a similar question before - How to retrieve hidden states for all time steps in LSTM or BiLSTM? - where @smth answered that "To get individual hidden states, you have to indeed loop over for each individual timestep and collect the hidden states". But now I feel the individual hidden states for each timestep are already provided in output. (Please correct me if I am wrong.)
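To make the comparison concrete, here is what I mean (a minimal sketch; single-layer, unidirectional, arbitrary shapes):

```python
import torch
import torch.nn as nn

seq_len, batch, input_size, hidden_size = 5, 3, 10, 20
lstm = nn.LSTM(input_size, hidden_size)
x = torch.randn(seq_len, batch, input_size)

# Approach 1: one call; output already stacks h_t for every t.
output, _ = lstm(x)

# Approach 2: loop over timesteps and collect each h_t manually.
h = c = torch.zeros(1, batch, hidden_size)
states = []
for t in range(seq_len):
    _, (h, c) = lstm(x[t:t+1], (h, c))
    states.append(h[0])
looped = torch.stack(states)             # (seq_len, batch, hidden_size)

print(torch.allclose(output, looped))    # True, so the loop seems redundant
```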
Moreover, this leads me to another question: if we need o_t from the last layer of the RNN, how can we get it? Can anyone shed some light on this?
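In case it helps, here is my current guess at how o_t could be recomputed from the layer's weights, using the gate equation o_t = sigmoid(W_io x_t + b_io + W_ho h_{t-1} + b_ho) from the docs and the fact that PyTorch concatenates the gate weights in (i, f, g, o) order. I am not sure this is the intended way:

```python
import torch
import torch.nn as nn

seq_len, batch, input_size, hidden_size = 5, 3, 10, 20
lstm = nn.LSTM(input_size, hidden_size)
x = torch.randn(seq_len, batch, input_size)

# The output gate parameters are the last hidden_size rows of each weight.
W_io = lstm.weight_ih_l0[3 * hidden_size:]   # (hidden_size, input_size)
W_ho = lstm.weight_hh_l0[3 * hidden_size:]   # (hidden_size, hidden_size)
b_o  = lstm.bias_ih_l0[3 * hidden_size:] + lstm.bias_hh_l0[3 * hidden_size:]

output, _ = lstm(x)
# h_{t-1} for each t: zeros at t=0, then the rows of output shifted by one.
h_prev = torch.cat([torch.zeros(1, batch, hidden_size), output[:-1]])

o = torch.sigmoid(x @ W_io.t() + h_prev @ W_ho.t() + b_o)
print(o.shape)   # (seq_len, batch, hidden_size): one o_t per timestep
```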