I would like to understand the inputs of the GRUCell, but unfortunately the documentation is imprecise on this point:

**Inputs:** The inputs are declared as (N, H_in) where N is undefined.

**Hidden:** The hidden state is declared as (N, H_out), where N is undefined.

The example takes a three-dimensional input:

```
import torch
import torch.nn as nn

rnn = nn.GRUCell(10, 20)  # input_size=10, hidden_size=20
input = torch.randn(6, 3, 10)
hx = torch.randn(3, 20)
output = []
for i in range(6):
    hx = rnn(input[i], hx)  # input[i] has shape (3, 10)
    output.append(hx)
```

If we loop over the first dimension of the input, I assume it must be time, which makes the second dimension the batch dimension that defines N.
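To check this reading, here is a minimal sketch that records the shapes seen at each step. It assumes the layout is (time, batch, features); the names `seq_len`, `batch`, `h_in`, `h_out`, and `step_shapes` are my own and do not come from the documentation.

```python
import torch
import torch.nn as nn

# Assumed layout: (time, batch, features)
seq_len, batch, h_in, h_out = 6, 3, 10, 20

rnn = nn.GRUCell(h_in, h_out)
inputs = torch.randn(seq_len, batch, h_in)
hx = torch.randn(batch, h_out)

step_shapes = []
for t in range(seq_len):
    x_t = inputs[t]      # one time step: (batch, h_in), i.e. (N, H_in)
    hx = rnn(x_t, hx)    # updated hidden state: (batch, h_out), i.e. (N, H_out)
    step_shapes.append((tuple(x_t.shape), tuple(hx.shape)))

print(step_shapes[0])
```

If this assumption holds, every slice passed to the cell is two-dimensional, which would match the (N, H_in) declaration with N = 3.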

However, why is the first dimension of the hidden parameter **hx** a batch dimension?

Or, if the second dimension is time and the first is the batch dimension, why does the example loop over the batch dimension?
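For comparison, this is a sketch of how I would expect the loop to look if the data really were stored batch-first. The (batch, time, features) layout and the names `batch_first_input`, `outputs`, and `stacked` are assumptions of mine for illustration, not something the documentation states.

```python
import torch
import torch.nn as nn

rnn = nn.GRUCell(10, 20)
batch_first_input = torch.randn(3, 6, 10)  # assumed layout: (batch, time, features)
hx = torch.randn(3, 20)                    # (batch, hidden), no time dimension

outputs = []
for t in range(batch_first_input.size(1)):  # time is dim 1 in this layout
    hx = rnn(batch_first_input[:, t], hx)   # each slice has shape (3, 10)
    outputs.append(hx)

stacked = torch.stack(outputs, dim=1)       # (batch, time, hidden)
print(stacked.shape)
```

In both layouts the cell itself only ever receives a (batch, features) slice and a (batch, hidden) state, so the time dimension is handled entirely by the surrounding loop.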