I think the docs for LSTM / GRU need to be modified:
https://pytorch.org/docs/stable/generated/torch.nn.LSTM.html
I have noticed that students consistently misunderstand the definition of input_size: they think it means the length of the series. This confuses them, because they expect an LSTM to work with variable-length data.
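
For illustration, here is a minimal sketch of the distinction. The sizes (3 features, hidden size 5, sequences of 4 and 11 steps) are arbitrary values chosen for this example, and batch_first=True is an assumption made only to keep the shapes easy to read. input_size is the number of features at each time step, and the same module accepts sequences of different lengths:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# input_size is the number of features per time step, not the series length.
# The sizes below are arbitrary and chosen only for this illustration.
lstm = nn.LSTM(input_size=3, hidden_size=5, batch_first=True)

# With batch_first=True the input shape is (batch, seq_len, input_size).
short = torch.randn(2, 4, 3)   # 2 sequences, 4 time steps, 3 features each
long = torch.randn(2, 11, 3)   # 2 sequences, 11 time steps, 3 features each

# The same module handles both sequence lengths.
out_short, (h_short, c_short) = lstm(short)
out_long, (h_long, c_long) = lstm(long)

print(out_short.shape)  # torch.Size([2, 4, 5])
print(out_long.shape)   # torch.Size([2, 11, 5])
print(h_long.shape)     # torch.Size([1, 2, 5])
```

The output has one hidden vector per time step, so its second dimension follows the sequence length, while the last dimension is always hidden_size.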
Additionally, the “example” at the bottom of the page is not easy to follow.
I’ve seen over 100 people go through these docs at this point, and I would estimate that fewer than 10% of them understand how to use the module after reading them.