Char RNN tutorial: a few doubts


I have a few doubts about the equations of the RNN used in the following tutorial.

The tutorial's code states:
hidden = self.i2h(combined)
output = self.i2o(combined)
But the standard RNN equations are:

h_t = tanh(W_ih x_t + W_hh h_{t-1} + b_h)
y_t = W_ho h_t + b_o

shouldn’t they be:
hidden = self.i2h(combined)
output = self.i2o(hidden)

This would mean self.i2o = nn.Linear(hidden_size, output_size), right? Also, why is there no non-linear activation after self.i2h? Could someone explain whether both variants are equally valid?
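To make the two wirings concrete, here is a minimal NumPy sketch of a single RNN step in both forms (all sizes and weight names are hypothetical, chosen only for illustration; the tutorial's nn.Linear layers are reduced to plain matrix products):

```python
import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size, output_size = 5, 8, 3  # hypothetical sizes

# Tutorial wiring: both layers read the concatenated [x_t, h_{t-1}].
W_i2h = rng.standard_normal((input_size + hidden_size, hidden_size))
W_i2o = rng.standard_normal((input_size + hidden_size, output_size))

# Textbook wiring: the output layer reads only the new hidden state h_t.
W_h2o = rng.standard_normal((hidden_size, output_size))

x = rng.standard_normal(input_size)   # x_t
h_prev = np.zeros(hidden_size)        # h_{t-1}
combined = np.concatenate([x, h_prev])

h_new = combined @ W_i2h          # hidden = i2h(combined); no nonlinearity, as in the tutorial
out_tutorial = combined @ W_i2o   # output = i2o(combined): reads x_t and h_{t-1} directly
out_textbook = h_new @ W_h2o      # output = h2o(hidden): reads only h_t
```

Both versions produce an output of size output_size; they differ only in whether the output layer sees the raw concatenation or the freshly computed hidden state.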

(Kuntoro Adi) #2

The tutorial is correct. In a recurrent neural network, at time step t, we pass in the input x_t together with the hidden state h_{t-1} from step t-1. Therefore, the argument to self.i2o is combined, not hidden.


For the hidden layer we pass h_{t-1} and x_t combined; for the output, only h_t. Check the equations again!
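One way to see why both wirings can behave similarly here: since the tutorial applies no nonlinearity after i2h, feeding hidden into the output layer would just compose two linear maps, which collapses into a single linear map of combined. A small NumPy check of that identity (sizes hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
n_comb, n_hid, n_out = 13, 8, 3  # hypothetical sizes

W_i2h = rng.standard_normal((n_comb, n_hid))
W_h2o = rng.standard_normal((n_hid, n_out))
combined = rng.standard_normal(n_comb)

# i2o(i2h(combined)) with linear layers equals one fused linear map of combined:
out_chain = (combined @ W_i2h) @ W_h2o
out_fused = combined @ (W_i2h @ W_h2o)
assert np.allclose(out_chain, out_fused)
```

So without an activation, "output from h_t" is itself a (rank-limited) linear function of combined, which is why the tutorial's direct i2o(combined) is a perfectly valid, and in fact more expressive, parameterization. With a tanh after i2h, the two variants would genuinely differ.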