From this sketch I’m trying to understand the parametrization of a bLSTM. Specifically, how exactly are the forward and backward outputs combined before the non-linearity? I couldn’t find it in the source code. The connections in question are in the blue/orange box.
I’d say it’s something like this: , with W and b being the weight matrix and bias for the corresponding direction, but I’m not too sure. Could someone verify this?
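To make the guess concrete, here is a minimal NumPy sketch of one plausible reading: each direction gets its own affine map, and the two results are summed before the non-linearity. Everything in it is an assumption for illustration: the names `h_fwd`, `h_bwd`, `W_fwd`, `W_bwd`, `b_fwd`, `b_bwd`, the layer sizes, and the choice of `tanh` are placeholders, not taken from any library's source. It also checks that this form is the same as concatenating the two hidden states and applying a single linear layer, which is how the combination is often written.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden_size, output_size = 4, 3  # hypothetical sizes, chosen for illustration

# Hidden states of the forward and backward LSTM at one time step (random stand-ins).
h_fwd = rng.standard_normal(hidden_size)
h_bwd = rng.standard_normal(hidden_size)

# One weight matrix and one bias per direction (assumed parametrization).
W_fwd = rng.standard_normal((output_size, hidden_size))
W_bwd = rng.standard_normal((output_size, hidden_size))
b_fwd = rng.standard_normal(output_size)
b_bwd = rng.standard_normal(output_size)

# Guess: sum the per-direction affine maps, then apply the non-linearity
# (tanh is an assumption; the sketch does not depend on which one is used).
y_sum = np.tanh(W_fwd @ h_fwd + b_fwd + W_bwd @ h_bwd + b_bwd)

# Same thing written as "concatenate, then one linear layer":
# stack the weights side by side and merge the two biases into one.
W_cat = np.concatenate([W_fwd, W_bwd], axis=1)  # shape (output_size, 2 * hidden_size)
h_cat = np.concatenate([h_fwd, h_bwd])          # shape (2 * hidden_size,)
b_cat = b_fwd + b_bwd
y_cat = np.tanh(W_cat @ h_cat + b_cat)

print(np.allclose(y_sum, y_cat))
```

If this reading is right, the two biases are redundant, since only their sum affects the output. Whether a given implementation keeps them separate is something the sketch cannot settle.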