I am implementing my own LSTM because I want to modify its weights during training. The naive (unidirectional) LSTM works pretty well, but the weights of the bidirectional LSTM confuse me.
According to the docs, in the bidirectional version weight_ih_l[k] (for layers k > 0) already has the shape
(4*hidden_size, num_directions * hidden_size). Because of the num_directions * hidden_size factor, I assumed it contains the weights for both the forward and the backward direction.
But there is also a weight_ih_l[k]_reverse. I am wondering whether this weight is redundant.
Since the official LSTM is implemented in C++, it is not easy for me to understand the source code (or even to find it).
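
To make the shapes concrete, here is a minimal sketch that lists every parameter of a two-layer bidirectional nn.LSTM. The sizes (input_size=10, hidden_size=20, num_layers=2) are arbitrary values picked only for illustration, not anything from my actual model.

```python
import torch.nn as nn

# Arbitrary sizes, chosen only for illustration.
input_size, hidden_size, num_layers = 10, 20, 2

lstm = nn.LSTM(input_size, hidden_size, num_layers=num_layers, bidirectional=True)

# Collect the shape of every parameter, keyed by its name.
shapes = {name: tuple(p.shape) for name, p in lstm.named_parameters()}

for name, shape in shapes.items():
    print(f"{name:28s} {shape}")
```

With these sizes, weight_ih_l0 and weight_ih_l0_reverse both come out as (80, 10), while weight_ih_l1 and weight_ih_l1_reverse both come out as (80, 40). So the num_directions * hidden_size width only shows up from the second layer on, and each direction still has its own tensor of that shape.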