Error in the PyTorch GRU Doc Page

abedshantti · June 23, 2021, 10:26am

I have noticed that there might be an error in the docs for the GRU layer.

https://pytorch.org/docs/stable/generated/torch.nn.GRU.html

Shouldn’t the hidden state in the 4th line of the definition be presented as:

h(t) = ( (1 - z(t) ) * h(t-1) ) + ( z(t) * n(t) )

instead of: h(t) = ( (1 - z(t) ) * n(t) ) + ( z(t) * h(t-1) )

as it is currently written in the docs?
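One thing worth checking before calling it an error: here is a minimal sketch with scalar values that compares the two forms. The numbers and the helper names (`update_docs_form`, `update_proposed_form`) are made up for illustration and are not taken from the PyTorch source. It assumes z(t) is a sigmoid of some pre-activation `a`, in which case 1 - z(t) is the sigmoid of `-a`, so the two forms may differ only in which state the gate is taken to weight.

```python
import math


def sigmoid(a: float) -> float:
    return 1.0 / (1.0 + math.exp(-a))


def update_docs_form(z: float, n: float, h_prev: float) -> float:
    # Form currently written in the docs: z weights the previous state.
    return (1.0 - z) * n + z * h_prev


def update_proposed_form(z: float, n: float, h_prev: float) -> float:
    # Form proposed in this post: z weights the candidate state.
    return (1.0 - z) * h_prev + z * n


# Hypothetical scalar values, chosen only for illustration.
a = 0.7        # update-gate pre-activation
n = 0.25       # candidate state n(t)
h_prev = -0.6  # previous hidden state h(t-1)

z = sigmoid(a)
z_flipped = sigmoid(-a)  # equals 1 - z

docs_value = update_docs_form(z, n, h_prev)
proposed_value = update_proposed_form(z_flipped, n, h_prev)

# With the same z the two forms give different numbers,
# but they agree once the gate is flipped (z -> 1 - z).
same_gate_value = update_proposed_form(z, n, h_prev)
print(docs_value, proposed_value, same_gate_value)
```

If that assumption holds, the difference would be one of convention (what a gate value near 1 means) rather than of what the layer can compute.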

Here is the architecture of the GRU layer:

https://gist.github.com/sey-kh/34cd13a4139b30776ff697b431c3c370

gru_recurrent_network.md

# GRU recurrent neural network  #

GRU (Gated Recurrent Unit) aims to solve the `vanishing gradient problem` that comes with a **standard recurrent neural network**: in some cases the gradient becomes so small that the weights effectively stop changing and the network stops learning.

## Standard recurrent neural network ##
![reccurrent-network-arch (1)](https://user-images.githubusercontent.com/49018140/59248727-c1309f80-8c12-11e9-82ea-fb6b6c70ed53.png)

An **RNN** can predict an output based on the previous output alone, or based on an external input plus the previous output. When processing sequence data it works iteratively: each step takes the previous output and uses it to generate the next one.

As the diagram above shows, there are external inputs (I0, I1, I2) and sequence outputs (O0, O1, O2).
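The recurrence described above can be sketched in a few lines. This is a minimal scalar illustration, not the gist's implementation: the function name `run_rnn`, the `tanh` activation, and the weights `w_in` and `w_rec` are all assumptions made for the example.

```python
import math


def run_rnn(inputs, w_in=0.5, w_rec=0.8, o_init=0.0):
    """Feed each external input plus the previous output into the next step."""
    outputs = []
    o_prev = o_init
    for i_t in inputs:
        # New output depends on the external input and the previous output.
        o_t = math.tanh(w_in * i_t + w_rec * o_prev)
        outputs.append(o_t)
        o_prev = o_t
    return outputs


# Three external inputs standing in for I0, I1, I2.
outputs = run_rnn([1.0, 0.0, -1.0])
print(outputs)  # three values standing in for O0, O1, O2
```

Note that the second input is zero, so the second output depends only on the first output, which is the "previous output only" case mentioned above.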

This file has been truncated.

ptrblck · June 24, 2021, 3:19am

Thanks for raising this issue. Would you mind creating a GitHub issue so that the code owners can take a look there, too?