Encountered a confusing RNN architecture

  1. import torch
  2. import torch.nn as nn
  3. class RNN(nn.Module):
  4.     def __init__(self, input_size, hidden_size, output_size):
  5.         super(RNN, self).__init__()
  6.         self.hidden_size = hidden_size
  7.         self.i2h = nn.Linear(input_size + hidden_size, hidden_size)  # [input, hidden] -> new hidden
  8.         self.i2o = nn.Linear(input_size + hidden_size, output_size)  # [input, hidden] -> output
  9.         self.softmax = nn.LogSoftmax(dim=1)
  10.     def forward(self, input, hidden):
  11.         combined = torch.cat((input, hidden), 1)  # concatenate along the feature dimension
  12.         hidden = self.i2h(combined)
  13.         output = self.i2o(combined)
  14.         output = self.softmax(output)
  15.         return output, hidden
  16.     def initHidden(self):
  17.         return torch.zeros(1, self.hidden_size)  # Variable wrapper is unnecessary since PyTorch 0.4

I found this code in a book named (Vishnu).
Is there an error on line 13? That is, should `combined` be replaced by `hidden`?
Correspondingly, should line 8 be changed to `self.h2o = nn.Linear(hidden_size, output_size)`?
Another of my reference books gives similar code with , so I am very confused.

In the code above, self.i2h takes the concatenation of the current input and the previous hidden state. Similarly, self.i2o takes that same concatenation, so the output is computed from the raw input and the previous hidden state rather than from the newly computed one.
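To make the shapes concrete, here is a minimal sketch of a single step with both layers reading the same concatenated vector. The sizes (5, 8, 3) and the names `x`, `h_prev`, `h_new`, and `logits` are made up for illustration and do not come from the book.

```python
import torch
import torch.nn as nn

# Hypothetical sizes, chosen only for illustration.
input_size, hidden_size, output_size = 5, 8, 3

# Both layers expect the concatenated [input, previous hidden] vector.
i2h = nn.Linear(input_size + hidden_size, hidden_size)
i2o = nn.Linear(input_size + hidden_size, output_size)

x = torch.randn(1, input_size)        # current input, shape (1, 5)
h_prev = torch.zeros(1, hidden_size)  # previous hidden state, shape (1, 8)

combined = torch.cat((x, h_prev), 1)  # shape (1, 13)
h_new = i2h(combined)                 # new hidden state, shape (1, 8)
logits = i2o(combined)                # output scores, shape (1, 3); h_new is not used here

print(combined.shape, h_new.shape, logits.shape)
```

Note that `logits` never sees `h_new`: the new hidden state only influences the output at the next time step, when it arrives as `h_prev`.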

The code above will still work, but I prefer the change you proposed. In theory, the output layer should not depend on the input directly; it should depend only on the current hidden state.
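As a rough sketch of that alternative, the class below replaces i2o with an h2o layer that reads only the newly computed hidden state. The class name `RNNHiddenToOutput` and the sizes in the usage lines are assumptions for illustration. Like the original, it applies no nonlinearity to the hidden state.

```python
import torch
import torch.nn as nn


class RNNHiddenToOutput(nn.Module):
    """Variant in which the output depends only on the current hidden state."""

    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.hidden_size = hidden_size
        self.i2h = nn.Linear(input_size + hidden_size, hidden_size)
        self.h2o = nn.Linear(hidden_size, output_size)  # hidden -> output
        self.softmax = nn.LogSoftmax(dim=1)

    def forward(self, input, hidden):
        combined = torch.cat((input, hidden), 1)
        hidden = self.i2h(combined)                 # new hidden state
        output = self.softmax(self.h2o(hidden))     # output from the new hidden state only
        return output, hidden

    def initHidden(self):
        return torch.zeros(1, self.hidden_size)


# Usage with hypothetical sizes: one step on a random input.
rnn = RNNHiddenToOutput(input_size=5, hidden_size=8, output_size=3)
hidden = rnn.initHidden()
output, hidden = rnn(torch.randn(1, 5), hidden)
print(output.shape, hidden.shape)
```

With this layout, the input can reach the output only through the hidden state, which matches the textbook picture of a recurrent cell. The original layout gives the output layer a direct view of the input instead.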