Hello!
I am implementing my own LSTM because I want to modify its weights during training. The plain (unidirectional) LSTM works well, but the weights of the bidirectional LSTM confuse me.
According to the docs, for the bidirectional version, weight_ih_l[k] (for layers k > 0) already has the shape (4*hidden_size, num_directions * hidden_size), which I assumed means it contains the weights for both the forward and the backward direction, because of the num_directions * hidden_size factor. But there is also weight_ih_l[k]_reverse. I am wondering whether this weight is redundant.
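To make the question concrete, here is a minimal sketch of the shapes as I read them from the docs. The helper lstm_weight_shapes is a hypothetical name of my own, not part of PyTorch, and it ignores biases and proj_size. It assumes that num_directions * hidden_size for layers k > 0 is the width of that layer's input (the concatenated forward and backward outputs of the previous layer), not a sign that one tensor stores both directions.

```python
def lstm_weight_shapes(input_size, hidden_size, num_layers=1, bidirectional=False):
    """Return the documented weight shapes of an LSTM as {name: shape}.

    Hypothetical helper for illustration only; biases and proj_size are ignored.
    """
    num_directions = 2 if bidirectional else 1
    suffixes = ["", "_reverse"] if bidirectional else [""]
    shapes = {}
    for k in range(num_layers):
        # Layer 0 reads the raw input; deeper layers read the previous layer's
        # output, which is num_directions * hidden_size wide.
        in_features = input_size if k == 0 else num_directions * hidden_size
        for suffix in suffixes:
            # The factor 4 stacks the input, forget, cell and output gates.
            shapes[f"weight_ih_l{k}{suffix}"] = (4 * hidden_size, in_features)
            shapes[f"weight_hh_l{k}{suffix}"] = (4 * hidden_size, hidden_size)
    return shapes


shapes = lstm_weight_shapes(input_size=10, hidden_size=20,
                            num_layers=2, bidirectional=True)
for name, shape in shapes.items():
    print(name, shape)
```

If this reading is right, each direction has its own tensor of the same shape, and the names and shapes printed here could be checked against named_parameters() of a real torch.nn.LSTM built with the same arguments.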
Since the official LSTM is implemented in C++, it is not easy for me to understand the source code (or even to find it).
Thanks!