Understanding the GRUCell and its inputs

Max_Unhold · March 13, 2024, 5:15pm

I would like to understand the inputs of the GRUCell, but unfortunately the documentation is very imprecise about them:

Inputs: The input is declared as (N, H_in), where N is undefined.
Hidden: The hidden state is declared as (N, H_out), where N is also undefined.

The example takes a three-dimensional input:

import torch
import torch.nn as nn

rnn = nn.GRUCell(10, 20)        # input_size=10, hidden_size=20
input = torch.randn(6, 3, 10)
hx = torch.randn(3, 20)
output = []
for i in range(6):
    hx = rnn(input[i], hx)      # input[i] has shape (3, 10)
    output.append(hx)

If we loop over the first dimension of the input, I assume it must be the time dimension, which makes the second dimension the batch dimension that defines N.
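To check this reading, here is a minimal sketch that records the shapes seen at every step of the loop. The names T, N, H_in and H_out are my own labels for the sizes in the example (an assumption about what they mean, not something the documentation states for T).

```python
import torch
import torch.nn as nn

# Assumed labels: T = time steps, N = batch size, H_in / H_out = feature sizes.
T, N, H_in, H_out = 6, 3, 10, 20

rnn = nn.GRUCell(H_in, H_out)
inputs = torch.randn(T, N, H_in)
hx = torch.randn(N, H_out)

step_shapes = []
for t in range(T):
    x_t = inputs[t]        # one slice along the first dimension: (N, H_in)
    hx = rnn(x_t, hx)      # the cell returns the next hidden state: (N, H_out)
    step_shapes.append((tuple(x_t.shape), tuple(hx.shape)))

print(step_shapes[0])      # ((3, 10), (3, 20))
```

Under this reading, each call to the cell receives a 2-D slice whose first dimension has size 3, the same size as the first dimension of hx.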

However, why is the first dimension of the hidden parameter hx a batch dimension?

Or, if the second dimension is time and the first is the batch dimension, why does the example loop over the batch dimension?
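To compare the two readings, a sketch like the following could be used. It runs the same cell over one tensor laid out as (T, N, H_in) and over a transposed view laid out as (N, T, H_in), stepping along whichever dimension is assumed to be time. The helper run_cell and the size labels are hypothetical names introduced only for this illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
T, N, H_in, H_out = 6, 3, 10, 20   # assumed labels, as in the example sizes

cell = nn.GRUCell(H_in, H_out)
h0 = torch.randn(N, H_out)

x_time_first = torch.randn(T, N, H_in)
x_batch_first = x_time_first.transpose(0, 1)   # same data viewed as (N, T, H_in)

def run_cell(x, time_dim):
    """Hypothetical helper: step the cell along the dimension assumed to be time."""
    h = h0
    for t in range(x.shape[time_dim]):
        x_t = x.select(time_dim, t)   # always a (N, H_in) slice
        h = cell(x_t, h)
    return h

h_time_first = run_cell(x_time_first, time_dim=0)
h_batch_first = run_cell(x_batch_first, time_dim=1)

same = torch.allclose(h_time_first, h_batch_first, atol=1e-6)
```

If same comes out True, that would suggest the cell only ever sees a (N, H_in) slice per call, and that which dimension of the 3-D tensor is looped over is a choice made by the surrounding loop rather than by GRUCell.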