Encounter a confusing RNN architecture

  1. import torch.nn as nn
  2. from torch.autograd import Variable
  3. class RNN(nn.Module):
  4.     def __init__(self, input_size, hidden_size, output_size):
  5.         super(RNN, self).__init__()
  6.         self.hidden_size = hidden_size
  7.         self.i2h = nn.Linear(input_size + hidden_size, hidden_size)
  8.         self.i2o = nn.Linear(input_size + hidden_size, output_size)
  9.         self.softmax = nn.LogSoftmax(dim=1)
  10.    def forward(self, input, hidden):
  11.        combined = torch.cat((input, hidden), 1)
  12.        hidden = self.i2h(combined)
  13.        output = self.i2o(combined)
  14.        output = self.softmax(output)
  15.        return output, hidden
  16.    def initHidden(self):
  17.        return Variable(torch.zeros(1, self.hidden_size))

I saw it in a book (by Vishnu).
Is there an error in line 13? I mean, should 'combined' be replaced by 'hidden'?
Correspondingly, should line 8 be changed to 'self.h2o = nn.Linear(hidden_size, output_size)'?
Another reference book of mine gives similar code, so I am quite confused.

The code above assumes that self.i2h takes in both the input and the hidden state from the previous step. Similarly, self.i2o takes in both the input and the previous hidden state.
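For context, here is a minimal sketch of how such a module is typically stepped over a sequence; the sizes and the random one-hot input are made up purely for illustration:

  import torch
  import torch.nn.functional as F

  # Hypothetical sizes, chosen only for illustration.
  n_letters, n_hidden, n_categories = 57, 128, 18
  rnn = RNN(n_letters, n_hidden, n_categories)  # the class quoted above
  # Note: forward() uses torch.cat, so `import torch` is needed as well.

  # A fake "word" of 5 random one-hot characters, shape (seq_len, 1, n_letters).
  idx = torch.randint(0, n_letters, (5,))
  seq = F.one_hot(idx, n_letters).float().unsqueeze(1)

  hidden = rnn.initHidden()
  for t in range(seq.size(0)):
      # Each step consumes one character plus the previous hidden state.
      output, hidden = rnn(seq[t], hidden)

  print(output.shape)  # torch.Size([1, n_categories]); log-probabilities from LogSoftmax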

While the code above will still work, I prefer the variant you mentioned: in theory, i2o should not depend on the input; instead, it should depend only on the output of the current hidden layer.
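For concreteness, here is a minimal sketch of that variant, assuming the output layer is renamed h2o and reads only from the newly computed hidden state (the rest follows the code quoted above; the Variable wrapper is dropped since it is no longer needed in current PyTorch):

  import torch
  import torch.nn as nn

  class RNN(nn.Module):
      def __init__(self, input_size, hidden_size, output_size):
          super(RNN, self).__init__()
          self.hidden_size = hidden_size
          # Input and previous hidden state are still combined to form the new hidden state.
          self.i2h = nn.Linear(input_size + hidden_size, hidden_size)
          # The output layer now reads only from the new hidden state.
          self.h2o = nn.Linear(hidden_size, output_size)
          self.softmax = nn.LogSoftmax(dim=1)

      def forward(self, input, hidden):
          combined = torch.cat((input, hidden), 1)
          hidden = self.i2h(combined)
          output = self.h2o(hidden)  # depends on the current hidden state, not on `combined`
          output = self.softmax(output)
          return output, hidden

      def initHidden(self):
          return torch.zeros(1, self.hidden_size)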