LSTM.weight_ih_l[k] dimensions with proj_size

mrityu · September 15, 2021, 9:27am

According to the PyTorch LSTM documentation:

LSTM.weight_ih_l[k] – the learnable input-hidden weights of the k-th layer (W_ii|W_if|W_ig|W_io), of shape (4*hidden_size, input_size) for k = 0. Otherwise, the shape is (4*hidden_size, num_directions * hidden_size)

My question is: why, for k > 0, is the shape of each weight (hidden_size, num_directions * hidden_size)? When proj_size is set, I would expect it to be (hidden_size, num_directions * proj_size), because every layer above the lowest one receives the output of the layer below it as input, and that output has shape (L, N, num_directions * proj_size).

mrityu · September 16, 2021, 2:10am

The dimensions in the doc will be updated very soon.

Github Issue
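
For anyone who wants to check the shapes themselves, here is a minimal sketch. The helper `expected_weight_ih_shape` is hypothetical (it is not part of PyTorch) and simply encodes the reasoning from the question: layer 0 sees input_size features, and every later layer sees the previous layer's output, which has num_directions * proj_size features when proj_size > 0. The sizes (7, 20, 5) are arbitrary illustrative values. The comparison against `nn.LSTM` only runs if PyTorch is installed, and it assumes a version that supports the proj_size argument.

```python
def expected_weight_ih_shape(k, input_size, hidden_size, proj_size=0, bidirectional=False):
    """Expected shape of weight_ih_l[k] (hypothetical helper, not a PyTorch API)."""
    num_directions = 2 if bidirectional else 1
    # With projections, each layer emits proj_size features per direction;
    # without them, it emits hidden_size features per direction.
    out_features = proj_size if proj_size > 0 else hidden_size
    # Layer 0 consumes the raw input; deeper layers consume the previous layer's output.
    in_features = input_size if k == 0 else num_directions * out_features
    # The four gates (i, f, g, o) are stacked along the first dimension.
    return (4 * hidden_size, in_features)


# Arbitrary example configuration (assumed values, chosen so every size differs).
cfg = dict(input_size=7, hidden_size=20, proj_size=5, bidirectional=True)
shape_l0 = expected_weight_ih_shape(0, **cfg)
shape_l1 = expected_weight_ih_shape(1, **cfg)
print(shape_l0, shape_l1)

# Optional cross-check against the real module, skipped if PyTorch is missing.
try:
    import torch.nn as nn
except ImportError:
    nn = None

if nn is not None:
    lstm = nn.LSTM(num_layers=2, **cfg)
    assert tuple(lstm.weight_ih_l0.shape) == shape_l0
    assert tuple(lstm.weight_ih_l1.shape) == shape_l1
```

If the assertions pass, the second dimension of weight_ih_l1 is num_directions * proj_size rather than num_directions * hidden_size, which matches the reasoning in the question.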