PyTorch LSTM Dimension Issue

I’m pretty sure that’s your issue.

Before that line, lstm_out has a shape of (batch_size, seq_len, num_directions*hidden_dim), assuming batch_first=True. After the .view() it’s (batch_size*seq_len*num_directions, hidden_dim), which is probably also not what you want. With batch_size=50, seq_len=200 and num_directions=1, the shape is as expected: (10000, hidden_dim).
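Here’s a minimal sketch of those shapes (the dimensions are taken from your setup; input_dim and hidden_dim are placeholder values I picked for illustration):

```python
import torch
import torch.nn as nn

batch_size, seq_len, input_dim, hidden_dim = 50, 200, 32, 64

lstm = nn.LSTM(input_dim, hidden_dim, batch_first=True)
x = torch.randn(batch_size, seq_len, input_dim)

lstm_out, (h_n, c_n) = lstm(x)
print(lstm_out.shape)  # torch.Size([50, 200, 64])

# The .view() flattens batch and time into one dimension:
flat = lstm_out.contiguous().view(-1, hidden_dim)
print(flat.shape)      # torch.Size([10000, 64])
```

So the 10,000 rows are not 10,000 samples; they are 50 samples × 200 time steps.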

This means you’ve created a tensor that is interpreted as having 10,000 samples. Given your classification task, here are two suggestions:

  • Don’t use lstm_out but lstm_hidden. lstm_out contains the hidden state at each time step; lstm_hidden contains only the last hidden state.

  • If you use lstm_out, you may want to sum or average the hidden states over the time steps.
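Both options can be sketched like this (num_classes and the layer names are placeholders, not from your code; for a single-layer, unidirectional LSTM, h_n[-1] is the last hidden state per sample):

```python
import torch
import torch.nn as nn

batch_size, seq_len, input_dim, hidden_dim, num_classes = 50, 200, 32, 64, 5

lstm = nn.LSTM(input_dim, hidden_dim, batch_first=True)
fc = nn.Linear(hidden_dim, num_classes)
x = torch.randn(batch_size, seq_len, input_dim)

lstm_out, (h_n, c_n) = lstm(x)

# Option 1: use the last hidden state.
# h_n has shape (num_layers*num_directions, batch_size, hidden_dim),
# so h_n[-1] is (batch_size, hidden_dim).
logits_last = fc(h_n[-1])               # (50, num_classes)

# Option 2: average lstm_out over the time dimension.
logits_mean = fc(lstm_out.mean(dim=1))  # (50, num_classes)
```

Either way you end up with one prediction per sample instead of one per time step.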
