Weights for Bidirectional LSTM


I am implementing an LSTM myself because I want to modify its weights during training. The plain (unidirectional) LSTM works well, but the weights of the bidirectional LSTM confuse me.

According to the docs, in the bidirectional version weight_ih_l[k] (for layers k > 0) already has the shape (4*hidden_size, num_directions * hidden_size). Because of the num_directions * hidden_size factor, I assumed it holds the weights for both the forward and the backward direction. But there is also weight_ih_l[k]_reverse. Is this second weight redundant?
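To make the question concrete, here is a minimal sketch that prints the input-hidden weight shapes of a two-layer bidirectional nn.LSTM. The sizes (input_size=10, hidden_size=20) and the `shapes` dictionary are arbitrary choices for illustration, not anything taken from the docs.

```python
import torch.nn as nn

# Hypothetical sizes, chosen only for illustration.
input_size, hidden_size, num_layers = 10, 20, 2

lstm = nn.LSTM(input_size, hidden_size, num_layers=num_layers, bidirectional=True)

# Collect the shape of every learnable parameter, keyed by its name.
shapes = {name: tuple(p.shape) for name, p in lstm.named_parameters()}

# Show only the input-hidden weights, for both directions of each layer.
for name, shape in shapes.items():
    if name.startswith("weight_ih"):
        print(name, shape)
```

With these sizes, both weight_ih_l0 and weight_ih_l0_reverse should have shape (80, 10), and both weight_ih_l1 and weight_ih_l1_reverse should have shape (80, 40). If I read this correctly, the num_directions * hidden_size dimension is the input width of layer 1 (the concatenated forward and backward outputs of layer 0), so each direction would still need its own weight tensor.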

Since the official LSTM is implemented in C++, it is not easy for me to understand the source code (or even find it :) ).