Parametrization of bidirectional LSTM

sigma_x · February 23, 2022, 1:24pm

From this sketch I’m trying to understand the parametrization of a bLSTM. Specifically, how are the outputs of the forward and backward passes combined before the non-linearity? I didn’t find it in the source code. The connections in question are in the blue/orange box.

I’d say it’s something like this: , with both W and b being parameters and bias for the corresponding direction, but I’m not too sure. Could someone verify it perhaps?

vdw · February 24, 2022, 6:09am

nn.LSTM with bidirectional=True does not on its own combine the results of the forward and backward passes. It is up to you to decide how you want to do that.

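You can see this in the shape of the output of nn.LSTM. If you check the docs, the shapes of both the output and the hidden state(s) include D, the number of directions (1 or 2): with the default batch_first=False, output has shape (L, N, D * H_out) and h_n has shape (D * num_layers, N, H_out).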
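To make the shapes concrete, here is a minimal sketch, assuming a single-layer nn.LSTM with arbitrary example sizes. It separates the two directions and then shows one possible way to combine them: a linear layer over the concatenated features followed by tanh. The `combine` layer and the choice of tanh are assumptions for illustration, not something nn.LSTM does for you. Since a linear layer over the concatenation [h_fwd; h_bwd] is the same as W_fwd h_fwd + W_bwd h_bwd + b, this may be close to what the question has in mind.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Arbitrary example sizes (assumptions for illustration only)
seq_len, batch, input_size, hidden_size = 5, 3, 4, 6

lstm = nn.LSTM(input_size, hidden_size, bidirectional=True)
x = torch.randn(seq_len, batch, input_size)

output, (h_n, c_n) = lstm(x)
# output: (seq_len, batch, 2 * hidden_size), both directions concatenated on the last dim
# h_n:    (2 * num_layers, batch, hidden_size), one entry per direction (and layer)

# Separate the directions: index 0 is forward, index 1 is backward
out_dirs = output.reshape(seq_len, batch, 2, hidden_size)
out_fwd = out_dirs[:, :, 0, :]
out_bwd = out_dirs[:, :, 1, :]

# The forward pass ends at the last time step, the backward pass at the first
fwd_matches = torch.allclose(h_n[0], out_fwd[-1])
bwd_matches = torch.allclose(h_n[1], out_bwd[0])

# One possible combination (an assumption, not part of nn.LSTM):
# a learned linear map over the concatenated directions, then a non-linearity
combine = nn.Linear(2 * hidden_size, hidden_size)
y = torch.tanh(combine(output))

print(output.shape, h_n.shape, y.shape, fwd_matches, bwd_matches)
```

Other common choices are to sum or average out_fwd and out_bwd, or simply to pass the concatenated output on to the next layer unchanged.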