How to correctly give inputs to Embedding, LSTM and Linear Layers?

  1. If you give nn.Embedding input of shape (seq_len, batch_size), it will happily produce output of shape (seq_len, batch_size, embedding_size). Embedding treats every element of its input as an index and replaces it with the corresponding embedding vector, so the order of the input dimensions doesn't matter (see the shape check after this list).

  2. Your LSTM input and output sizes look mostly good to me. This post helped me get my head around them: Understanding output of lstm

You can initialise nn.LSTM with batch_first=True if you need to switch the seq_len and batch_size dimensions of the input and output.

If the input to nn.Embedding is appropriately shaped then I can’t see why a .view operation before the LSTM should be necessary.
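For example, here is a minimal shape check (a sketch assuming a recent PyTorch version; all the sizes are made up for illustration):

import torch
import torch.nn as nn

vocab_size, embedding_size, hidden_size = 100, 8, 16
seq_len, batch_size = 5, 3

embedding = nn.Embedding(vocab_size, embedding_size)
tokens = torch.randint(0, vocab_size, (seq_len, batch_size))  # integer indices, (seq_len, batch_size)

embedded = embedding(tokens)
print(embedded.shape)   # torch.Size([5, 3, 8]) = (seq_len, batch_size, embedding_size)

# By default nn.LSTM expects (seq_len, batch_size, input_size), so no .view is needed
lstm = nn.LSTM(embedding_size, hidden_size)
output, (h_n, c_n) = lstm(embedded)
print(output.shape)     # torch.Size([5, 3, 16]) = (seq_len, batch_size, hidden_size)

# With batch_first=True the batch dimension comes first in input and output
lstm_bf = nn.LSTM(embedding_size, hidden_size, batch_first=True)
output_bf, _ = lstm_bf(embedded.transpose(0, 1))  # (batch_size, seq_len, embedding_size)
print(output_bf.shape)  # torch.Size([3, 5, 16]) = (batch_size, seq_len, hidden_size)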

  3. For consuming the last hidden state only…
lstm_output, (last_hidden_state, last_cell_state) = self.lstm(embedded)
linear_input = last_hidden_state[-1] # hidden state of the last layer at the final time step
# or equivalently, for a unidirectional LSTM
linear_input = lstm_output[-1] # output at the last time step
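You can verify that those two are the same for a unidirectional LSTM (a quick standalone check with made-up sizes; for a bidirectional LSTM they are not interchangeable):

import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=8, hidden_size=16, num_layers=2)
embedded = torch.randn(5, 3, 8)  # (seq_len, batch_size, input_size)
lstm_output, (last_hidden_state, last_cell_state) = lstm(embedded)

# last_hidden_state is (num_layers, batch_size, hidden_size); its last entry is the
# top layer's final hidden state, which equals the output at the final time step
print(torch.allclose(last_hidden_state[-1], lstm_output[-1]))  # True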

For consuming the hidden states of the whole sequence

lstm_output, (last_hidden_state, last_cell_state) = self.lstm(embedded)
batch_first = lstm_output.transpose(0, 1) # (batch_size, seq_len, hidden_size)
linear_input = batch_first.contiguous().view(batch_size, -1) # transpose makes the tensor non-contiguous, so .contiguous() is needed before .view

Note that in this case the sequence length must always be the same, because the following Linear layer needs a fixed input size of seq_len * hidden_size.
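Putting it together with a Linear layer (a sketch with made-up sizes; the Linear's in_features must be seq_len * hidden_size, which is why seq_len has to be fixed):

import torch
import torch.nn as nn

seq_len, batch_size, input_size, hidden_size = 5, 3, 8, 16
lstm = nn.LSTM(input_size, hidden_size)
linear = nn.Linear(seq_len * hidden_size, 10)  # e.g. 10 output classes

embedded = torch.randn(seq_len, batch_size, input_size)
lstm_output, _ = lstm(embedded)

batch_first = lstm_output.transpose(0, 1)                    # (batch_size, seq_len, hidden_size)
linear_input = batch_first.contiguous().view(batch_size, -1) # (batch_size, seq_len * hidden_size)
logits = linear(linear_input)
print(logits.shape)  # torch.Size([3, 10])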

Most tensor ops also work on Variables, which is necessary if you want to backpropagate: if you operate on the underlying tensors directly, those operations are not recorded in the computation graph and cannot be backpropagated through. (In PyTorch 0.4 and later, Variable and Tensor are merged, so a plain tensor tracks gradients whenever requires_grad=True.)
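With current PyTorch you can see the difference by going through .data, which bypasses autograd (a toy example):

import torch

x = torch.ones(3, requires_grad=True)
y = (x * 2).sum()   # recorded in the computation graph
y.backward()
print(x.grad)       # tensor([2., 2., 2.])

z = (x.data * 2).sum()  # .data detaches from the graph; nothing is recorded
print(z.requires_grad)  # False -- calling z.backward() would raise an error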
