I am looking at the example at the bottom of this page:

https://pytorch.org/docs/stable/generated/torch.nn.LSTM.html

```
import torch
import torch.nn as nn

rnn = nn.LSTM(10, 20, 2)  # LSTM(input_size, hidden_size, num_layers)
input = torch.randn(5, 3, 10)
h0 = torch.randn(2, 3, 20)
c0 = torch.randn(2, 3, 20)
output, (hn, cn) = rnn(input, (h0, c0))
```

If I understand correctly:

- the LSTM has an input dimension of 10, a hidden dimension of 20, and 2 layers;
- the input is a batch of size 5, each item being a sequence of three elements of size 10 (i.e. the input dimension of the LSTM).

Am I correct?

If I am, why do I need three initial hidden tensors, i.e. why is the second dimension of h0 equal to 3?
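In case it helps, printing the shapes from the same snippet shows that hn and cn come back with the same shapes as h0 and c0, while output's last dimension is the hidden size:

```python
import torch
import torch.nn as nn

rnn = nn.LSTM(10, 20, 2)
input = torch.randn(5, 3, 10)
h0 = torch.randn(2, 3, 20)
c0 = torch.randn(2, 3, 20)
output, (hn, cn) = rnn(input, (h0, c0))

# Shapes of the resulting tensors:
print(output.shape)  # torch.Size([5, 3, 20])
print(hn.shape)      # torch.Size([2, 3, 20])
print(cn.shape)      # torch.Size([2, 3, 20])
```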