Hi,
I wanted to point out a possible typo on the website.
https://pytorch.org/docs/stable/generated/torch.nn.GRU.html
The hidden-state update equation should be h_t = (1 - z_t) * h_{t-1} + z_t * n_t, not the other way around.
Intuitively this reads like a typo, and the paper by the creators of the GRU also uses the form of the equation I give above (ref: Chung et al., "Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling", arXiv:1412.3555).
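If it helps, here is a small plain-Python sketch (illustrative only, not PyTorch code; the function names are my own) showing the two forms side by side. They differ only in which role the update gate z_t plays: swapping z_t for (1 - z_t) turns one into the other.

```python
def update_paper(h_prev, n, z):
    # Form in Chung et al. (arXiv:1412.3555):
    #   h_t = (1 - z_t) * h_{t-1} + z_t * n_t
    return (1 - z) * h_prev + z * n

def update_docs(h_prev, n, z):
    # Form currently shown on the linked page:
    #   h_t = (1 - z_t) * n_t + z_t * h_{t-1}
    return (1 - z) * n + z * h_prev

# The two conventions agree once the gate is relabeled z -> 1 - z:
h_prev, n, z = 0.5, -0.2, 0.3
assert abs(update_paper(h_prev, n, z) - update_docs(h_prev, n, 1 - z)) < 1e-12
```

So the implementation may well be internally consistent either way; my question is only whether the documented equation matches the gate definitions above it on the page.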
Can you please confirm?
Regards,
Aman