RuntimeError: Expected hidden size (2, 1, 128), got [64, 1, 128]

No idea why you are doing this…but don’t :).

Before that line, h0 has the correct shape: (num_layers, batch_size, hidden_size). After it, h0 is only (batch_size, hidden_size). Since nn.RNN sees only a 2d tensor, it treats h0 as the hidden state of an unbatched input, i.e. it reads the shape as (num_layers, hidden_size).

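The problem is that nn.RNN works with a 3d hidden state internally. It therefore does an unsqueeze(1) to add the batch dimension: again, nn.RNN thinks it’s an unbatched input, so it makes it a batched one with batch size 1. That is how a (64, 128) tensor ends up as the [64, 1, 128] in the error message, while the expected shape is (2, 1, 128). You can also check this post that addresses the same issue.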
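Here is a minimal sketch that should reproduce the shape mismatch. The sizes are picked to match the error message, and `h0[-1]` is only a hypothetical stand-in for whatever line drops the first dimension in your code. It assumes a PyTorch version that accepts unbatched (2d) inputs.

```python
import torch
import torch.nn as nn

# Hypothetical sizes, chosen only to match the numbers in the error message.
num_layers, batch_size, hidden_size = 2, 64, 128
input_size, seq_len = 10, 5

rnn = nn.RNN(input_size, hidden_size, num_layers=num_layers)

# Correct usage: 3d input (seq_len, batch_size, input_size)
# together with a 3d h0 (num_layers, batch_size, hidden_size).
x = torch.randn(seq_len, batch_size, input_size)
h0 = torch.zeros(num_layers, batch_size, hidden_size)
out, hn = rnn(x, h0)
out_shape = tuple(out.shape)  # (seq_len, batch_size, hidden_size)
hn_shape = tuple(hn.shape)    # (num_layers, batch_size, hidden_size)

# Broken usage: h0 has lost its first dimension, so it is 2d.
# Stand-in for the offending line; the original line is not known.
bad_h0 = h0[-1]               # (batch_size, hidden_size)

# With a 2d input, nn.RNN assumes everything is unbatched and
# unsqueezes h0 at dim 1, so 64 lands in the num_layers slot.
x_unbatched = torch.randn(seq_len, input_size)
try:
    rnn(x_unbatched, bad_h0)
    failed = False
except RuntimeError as err:
    failed = True
    print(err)
```

The fix is simply to leave h0 alone and pass it as (num_layers, batch_size, hidden_size) along with a 3d input.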
