Building a Recurrent Neural Network with a feed-forward network in PyTorch

I was going through this tutorial, and I have a doubt about the following class code.

import torch
import torch.nn as nn
from torch.autograd import Variable

class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(RNN, self).__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.output_size = output_size
        self.i2h = nn.Linear(input_size + hidden_size, hidden_size)
        self.i2o = nn.Linear(input_size + hidden_size, output_size)
        # dim=1 normalizes over the output classes for a (batch, output_size) tensor
        self.softmax = nn.LogSoftmax(dim=1)

    def forward(self, input, hidden):
        combined = torch.cat((input, hidden), 1)
        hidden = self.i2h(combined)
        output = self.i2o(combined)
        output = self.softmax(output)
        return output, hidden

    def init_hidden(self):
        return Variable(torch.zeros(1, self.hidden_size))

This code was taken from here. There it was mentioned that:

Since the state of the network is held in the graph and not in the
layers, you can simply create an nn.Linear and reuse it over and over
again for the recurrence.

What I don’t understand is how one can just increase the input feature size in nn.Linear and call it an RNN. Am I missing something here?


I guess this module should be used in a for loop depending on what you need.
For example, to just get the output for each input:

def forward_inputs(inputs):
    # mod is an instance of the RNN class above
    hidden = mod.init_hidden()
    outputs = []
    for inp in inputs:
        out, hidden = mod(inp, hidden)
        outputs.append(out)  # keep the output of every step
    return outputs
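
To make the loop concrete, here is a minimal self-contained sketch of the same idea. The sizes (5 input features, 8 hidden units, 3 output classes, 4 time steps) are made up for illustration, and I use plain tensors instead of Variable, which recent PyTorch versions allow. The class is a lightly modernized copy of the one from the question, not the tutorial's exact code.

```python
import torch
import torch.nn as nn


class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.hidden_size = hidden_size
        self.i2h = nn.Linear(input_size + hidden_size, hidden_size)
        self.i2o = nn.Linear(input_size + hidden_size, output_size)
        self.softmax = nn.LogSoftmax(dim=1)

    def forward(self, input, hidden):
        # Concatenate the current input with the previous hidden state.
        combined = torch.cat((input, hidden), 1)
        hidden = self.i2h(combined)
        output = self.softmax(self.i2o(combined))
        return output, hidden

    def init_hidden(self):
        return torch.zeros(1, self.hidden_size)


# Hypothetical sizes, chosen only for illustration.
mod = RNN(input_size=5, hidden_size=8, output_size=3)
inputs = torch.randn(4, 1, 5)  # 4 time steps, batch of 1, 5 features

hidden = mod.init_hidden()
outputs = []
for inp in inputs:  # each inp has shape (1, 5)
    # The same two linear layers are reused at every step;
    # only the hidden state changes between iterations.
    out, hidden = mod(inp, hidden)
    outputs.append(out)

outputs = torch.stack(outputs)  # shape (4, 1, 3)
```

The recurrence lives entirely in the for loop: the hidden state returned by one call is fed into the next call.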

Hi, thanks for the answer. I was wondering, is it very common to use this kind of setup for an RNN?

I don’t use RNNs much.
But that depends a lot on what you want to do.
If you’re looking for standard stuff, there are existing modules that do many-to-many or many-to-one mapping and so on. You can find them here in the doc.
This implementation is useful if you want to do something less common and need to control how each step of the RNN is done, for example changing the hidden state between iterations or filtering which outputs you want to return.
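
For the standard case, a rough sketch with the built-in nn.RNN module could look like this. The sizes are again made up for illustration. Note that nn.RNN applies a tanh nonlinearity by default, so it is not numerically identical to the hand-written module in the question.

```python
import torch
import torch.nn as nn

# Hypothetical sizes, chosen only for illustration.
rnn = nn.RNN(input_size=5, hidden_size=8)  # default layout: (seq_len, batch, features)
inputs = torch.randn(4, 1, 5)              # 4 time steps, batch of 1, 5 features

# The module runs the loop over time steps internally.
all_hidden, last_hidden = rnn(inputs)

# all_hidden:  (4, 1, 8), the hidden state at every step (many to many)
# last_hidden: (1, 1, 8), the final hidden state only (many to one)
```

Here you get every step in one call, but you give up the per-step control that the explicit loop provides.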


I just looked it up and saw that torch.cat combines multiple tensors. So, in the code,
combined = torch.cat((input, hidden), 1)
this means that combined is the input and the hidden layer concatenated, right? So
self.i2h goes from the input to the hidden layer, and self.i2o goes from combined to the next layer’s output? So, in a way, he made a single tensor for all the time steps. Is my understanding correct?

Well, this is how an RNN works: you combine the current input with the previous step’s hidden state to get, on one hand, the output of this step and, on the other, the new hidden state.
Here, to do so, the input and the previous step’s hidden state are concatenated into a single tensor. That tensor is rebuilt at every time step, so it is not one tensor for all the time steps. Then one linear layer is used to get the output and another to get the new state.
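
If it helps, here is a small sketch of why a single nn.Linear on the concatenated tensor counts as a recurrent update. Its weight matrix can be split into a part that multiplies the input and a part that multiplies the hidden state, which is the usual W_x x + W_h h + b form. The sizes and the names W_x and W_h are mine, for illustration only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
input_size, hidden_size = 5, 8  # hypothetical sizes
i2h = nn.Linear(input_size + hidden_size, hidden_size)

x = torch.randn(1, input_size)   # current input
h = torch.randn(1, hidden_size)  # previous hidden state

# One linear layer applied to the concatenated tensor...
combined_result = i2h(torch.cat((x, h), 1))

# ...equals two separate linear maps, one per part of the weight matrix.
W_x = i2h.weight[:, :input_size]  # shape (hidden_size, input_size)
W_h = i2h.weight[:, input_size:]  # shape (hidden_size, hidden_size)
split_result = x @ W_x.t() + h @ W_h.t() + i2h.bias

same = torch.allclose(combined_result, split_result, atol=1e-5)
```

So enlarging the input of nn.Linear is just a compact way of giving the layer a second weight matrix for the hidden state.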


Stack Overflow answer for the same question.