Implement Recurrence Over a Layer

Hi

From this tutorial (http://pytorch.org/tutorials/beginner/former_torchies/nn_tutorial.html), it looks like implementing recurrence over a layer is just a matter of repeating the layer in the forward function when defining the network.

My question is:

  1. Can I implement a simple RNN this way? I couldn’t find any resources on this on the internet. For example, if I want to try a variant of LSTM or GRU, could I define a class containing the new cell’s implementation and then just repeat it over time steps (i.e., the sequence values in a text)?

  2. Can I define recurrence over an arbitrary layer? Say I want recurrence over a Linear layer. Could I just take the layer and implement recurrence by looping over the sequence and feeding each output back through the same layer?

Thanks for the awesome library!

1 & 2: yes you can :slight_smile:


Great, thanks! Are there any links or implementations that build RNNs this way, or that apply recurrence over an arbitrary layer?

Perhaps the example here http://pytorch.org/docs/master/nn.html?highlight=cell#torch.nn.RNNCell ?

You can also write a custom Module that wraps the for loop in forward as well.
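For example, such a wrapper could look like this (a minimal sketch, not an official API: the class name `SimpleRNN` and the sizes are illustrative assumptions):

```python
import torch
import torch.nn as nn

class SimpleRNN(nn.Module):
    """Applies the same cell at every time step, i.e. recurrence over a layer."""
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.hidden_size = hidden_size
        self.cell = nn.RNNCell(input_size, hidden_size)

    def forward(self, x, hx=None):
        # x has shape (seq_len, batch, input_size)
        if hx is None:
            hx = x.new_zeros(x.size(1), self.hidden_size)
        outputs = []
        for t in range(x.size(0)):      # the recurrence lives in this loop
            hx = self.cell(x[t], hx)    # reuse the identical cell each step
            outputs.append(hx)
        return torch.stack(outputs), hx
```

Calling `SimpleRNN(10, 20)(torch.randn(6, 3, 10))` would then return all per-step hidden states of shape (6, 3, 20) plus the final state of shape (3, 20).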

Thanks! But I don’t get how 1 or 2 are implemented there… the link describes the different RNN cell classes.

A custom module that wraps the for loop in forward… I will try that.

I meant the examples in the docs for using RNNCell. They essentially apply the same module repeatedly.


Yup, got it now, thanks…

rnn = nn.RNNCell(10, 20)        # input size 10, hidden size 20
input = torch.randn(6, 3, 10)   # 6 time steps, batch of 3
hx = torch.randn(3, 20)         # initial hidden state
output = []
for i in range(6):
    hx = rnn(input[i], hx)      # this is the recurrence
    output.append(hx)
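The same pattern also answers question 2: recurrence over an arbitrary layer, e.g. a Linear layer, is just the same loop with that layer as the repeated module (a minimal sketch; the sizes and the tanh nonlinearity are illustrative assumptions):

```python
import torch
import torch.nn as nn

linear = nn.Linear(20, 20)  # output size must match input size so it can feed back
x = torch.randn(3, 20)      # batch of 3
h = x
for _ in range(5):              # apply the same layer 5 times
    h = torch.tanh(linear(h))   # identical weights reused at every step
```

Note the layer has to map back to its own input size, otherwise its output cannot be fed into it again.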


Hi, please take a look at Implementing Recurrence on an Arbitrary Layer. Thanks.

I assume that using a for loop would really slow things down. Is there a wrapper that does something similar?

I don’t think there is another way to achieve this.
You have to execute it sequentially, since each iteration requires the result of the previous one.
The overhead of the loop itself should be negligible: GPU execution is asynchronous, so the loop only enqueues the kernel calls.
