RNN: output vs hidden state don't match up (my misunderstanding?)

@dhruvbird Motivated by your post I’ve actually checked and replied to my old post. Maybe you can have a look to see if this makes sense.

The important bit is that when bidirectional=True, you get both hidden states for each sequence item in concatenated form: first the one for the forward direction, then the one for the backward direction. This also implies that the respective last hidden states sit at opposite ends of the sequence. Again, I give a concrete example in my reply linked above.
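To make that concrete, here is a minimal sketch with a single-layer bidirectional GRU (all sizes are arbitrary, just for illustration). It shows that the forward half of `output` at the last time step and the backward half at the first time step are exactly the two entries of `h_n`:

```python
import torch
import torch.nn as nn

# Illustrative sizes only
seq_len, batch, input_size, hidden_size = 5, 3, 7, 4

rnn = nn.GRU(input_size, hidden_size, num_layers=1, bidirectional=True)
x = torch.randn(seq_len, batch, input_size)

output, h_n = rnn(x)
# output: (seq_len, batch, 2 * hidden_size) -- forward and backward concatenated per time step
# h_n:    (num_layers * 2, batch, hidden_size)

out_fwd = output[:, :, :hidden_size]  # forward direction
out_bwd = output[:, :, hidden_size:]  # backward direction

# The last hidden states sit at opposite ends of the sequence:
print(torch.allclose(out_fwd[-1], h_n[0]))  # forward: last time step
print(torch.allclose(out_bwd[0],  h_n[1]))  # backward: first time step
```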

It’s not really complicated, as it is consistent and intuitive. But, yes, the bidirectional case requires a bit more care to “extract” the correct bits of h_n to be used for further layers. This also depends on what you’re actually trying to train.
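For example, one common pattern (just a sketch, assuming the single-layer GRU from above) is to take the last layer’s forward and backward states from h_n and concatenate them before a classifier:

```python
num_layers, num_directions = 1, 2
h = h_n.view(num_layers, num_directions, batch, hidden_size)

last_fwd = h[-1, 0]  # last layer, forward direction
last_bwd = h[-1, 1]  # last layer, backward direction

# (batch, 2 * hidden_size) -- e.g. as input to a linear layer
features = torch.cat([last_fwd, last_bwd], dim=1)
```

Whether you want both directions, only one, or the per-step `output` instead depends on the task.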