I am looking at the example at the bottom of this page:
https://pytorch.org/docs/stable/generated/torch.nn.LSTM.html
```python
import torch
import torch.nn as nn

rnn = nn.LSTM(10, 20, 2)
input = torch.randn(5, 3, 10)
h0 = torch.randn(2, 3, 20)
c0 = torch.randn(2, 3, 20)
output, (hn, cn) = rnn(input, (h0, c0))
```
If I understand correctly:
- the LSTM has an input dimension of 10, a hidden dimension of 20 and 2 layers;
- the input is a batch of 5 items, each item being a sequence of three elements of size 10 (i.e. the input dimension of the LSTM).
Am I correct?
If I am, why does the initial hidden state h0 contain three tensors of size 20 per layer (its shape is (2, 3, 20)) when the batch has five items?
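For reference, here is the same snippet run end to end with the resulting shapes printed, which shows where the 3 and the 5 end up:

```python
import torch
import torch.nn as nn

rnn = nn.LSTM(10, 20, 2)           # input_size=10, hidden_size=20, num_layers=2
input = torch.randn(5, 3, 10)
h0 = torch.randn(2, 3, 20)
c0 = torch.randn(2, 3, 20)
output, (hn, cn) = rnn(input, (h0, c0))

print(output.shape)  # torch.Size([5, 3, 20])
print(hn.shape)      # torch.Size([2, 3, 20])
print(cn.shape)      # torch.Size([2, 3, 20])
```

Note that `output` keeps the 5 in its first dimension while `hn` and `cn` keep the 3 in their second dimension, which is exactly the mismatch the question is about.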