Weight data structure for RNN

In the file `pytorch/torch/nn/modules/rnn.py`, line 37 has `layer_input_size = hidden_size * num_directions`.
At the same time, in the file `pytorch/torch/backends/cudnn/rnn.py`, line 121 has `num_layers = fn.num_directions * fn.num_layers`.

I’m a little confused: in the bidirectional case, do we want to double the input size or the number of layers? What does the data layout look like? Both seem to make sense, but they appear to conflict with each other.
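To make the question concrete, here is a small sketch of how I currently read the two formulas. The function names are mine for illustration, not PyTorch API; I’m only restating the two quoted expressions:

```python
def layer_input_sizes(input_size, hidden_size, num_layers, num_directions):
    # torch/nn/modules/rnn.py line 37: layers after the first take
    # hidden_size * num_directions as their input size.
    return [input_size if layer == 0 else hidden_size * num_directions
            for layer in range(num_layers)]

def cudnn_layer_count(num_layers, num_directions):
    # torch/backends/cudnn/rnn.py line 121: the layer count itself
    # is multiplied by num_directions.
    return num_directions * num_layers

# Example: 2-layer bidirectional RNN, input_size=10, hidden_size=20.
print(layer_input_sizes(10, 20, 2, 2))  # [10, 40]
print(cudnn_layer_count(2, 2))          # 4
```

So one file doubles the per-layer input size while the other doubles the layer count, and I don’t see how the resulting weight layouts line up.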