Speeding up RNN loop

Hi, I’m new to PyTorch. I have a simple RNN, and during the forward() pass I compute the loss at each time step in a Python loop, which is very slow. Is there a way to speed this up?
I read that such loops can be avoided using packed sequences, but I don’t understand how that works.

import torch
import torch.nn as nn
from torch.autograd import Variable

class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size, n_layers=1):
        super(RNN, self).__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.output_size = output_size
        self.n_layers = n_layers

        self.encoder = nn.Embedding(input_size, hidden_size)
        self.rnn = nn.RNN(hidden_size, hidden_size, n_layers)
        self.decoder = nn.Linear(hidden_size, output_size)

    def forward(self, input, hidden):
        # embed a single token: (1, hidden_size)
        input = self.encoder(input.view(1, -1))
        # one RNN step: input shaped (seq_len=1, batch=1, hidden_size)
        output, hidden = self.rnn(input.view(1, 1, -1), hidden)
        # project the hidden state to output logits
        output = self.decoder(output.view(1, -1))
        return output, hidden

    def init_hidden(self):
        # zero state of shape (n_layers, batch=1, hidden_size)
        return Variable(torch.zeros(self.n_layers, 1, self.hidden_size))

And the loss function is computed as:

def loss(inputs, targets):

    # reset the hidden state
    hidden = net.init_hidden()

    loss = 0
    chunk_len = inputs.shape[0]
    for c in range(chunk_len):
        # one forward step per token -- this loop is the bottleneck
        output, hidden = net(inputs[c], hidden)
        output_ = output.view(1, -1)
        target_ = targets[c].view(-1)
        loss += criterion(output_, target_)

    # mean loss per time step (don't divide loss.data here,
    # or the result is detached and backward() won't work)
    mle = loss / chunk_len

    return mle

You are using nn.RNN, which can take the entire sequence in one call, so you don’t need to feed it token by token.
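Something like this should work as a rough sketch, assuming inputs is a 1-D LongTensor of token indices, targets holds the corresponding target tokens, and criterion is nn.CrossEntropyLoss:

def loss(inputs, targets):
    hidden = net.init_hidden()
    seq_len = inputs.shape[0]

    # embed the whole sequence at once: (seq_len, batch=1, hidden_size)
    embedded = net.encoder(inputs).view(seq_len, 1, -1)

    # a single nn.RNN call runs the loop over time steps internally
    output, hidden = net.rnn(embedded, hidden)

    # decode every time step with one Linear call: (seq_len, output_size)
    logits = net.decoder(output.view(seq_len, -1))

    # CrossEntropyLoss averages over all time steps by default
    return criterion(logits, targets.view(-1))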

Thanks, but what about the hidden state? How is it passed between steps if we feed in the whole sequence at once?

It’s passed internally; you can still supply the starting hidden state if you want to.
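For example (a minimal sketch with made-up sizes):

import torch
import torch.nn as nn

rnn = nn.RNN(input_size=10, hidden_size=20, num_layers=1)
x = torch.randn(5, 1, 10)        # (seq_len, batch, input_size)
h0 = torch.zeros(1, 1, 20)       # (num_layers, batch, hidden_size)

out, hn = rnn(x, h0)             # explicit starting hidden state
out, hn = rnn(x)                 # or omit it; it defaults to zeros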
