Implement Recurrence Over a Layer

Hi

Going by this article (http://pytorch.org/tutorials/beginner/former_torchies/nn_tutorial.html), it looks like implementing recurrence over a layer is just a matter of repeatedly applying the layer in the forward function when defining the network.

My questions are:

  1. Can I implement a simple RNN this way? I couldn’t find any resources on this online. For example, if I want to try a variant of LSTM or GRU, could I define a class with the new cell’s implementation and then just repeat it over the time steps (i.e. the sequence values in a text)?

  2. Can I define recurrence over an arbitrary layer? Say I want recurrence over a Linear layer. Could I just take the layer and implement recurrence by looping over the sequence and applying the same layer again at each step? (A rough sketch of what I mean is below.)
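
For concreteness, here is a minimal sketch of the kind of thing I mean (the cell below is just a Linear layer followed by a tanh, standing in for an LSTM/GRU variant; all names and sizes are made up):

import torch
import torch.nn as nn

class MyCell(nn.Module):
    """Stand-in for a custom cell variant: one Linear layer followed by a tanh."""
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.linear = nn.Linear(input_size + hidden_size, hidden_size)

    def forward(self, x, hx):
        # one time step: combine the current input with the previous hidden state
        return torch.tanh(self.linear(torch.cat([x, hx], dim=1)))

cell = MyCell(10, 20)
x = torch.randn(6, 3, 10)   # (seq_len, batch, input_size)
hx = torch.zeros(3, 20)     # initial hidden state
for t in range(x.size(0)):
    hx = cell(x[t], hx)     # the same module is applied at every time step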

Thanks for the awesome library!

1 & 2: yes, you can :slight_smile:


Great, thanks! Are there any links or implementations that build RNNs this way, or that apply recurrence over an arbitrary layer?

Perhaps the example here http://pytorch.org/docs/master/nn.html?highlight=cell#torch.nn.RNNCell ?

You can also write a custom Module that wraps the for loop in its forward.
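
Roughly something like this (a quick sketch, not from the docs; the wrapper and its names are made up):

import torch
import torch.nn as nn

class RecurrentWrapper(nn.Module):
    """Wraps any step-wise cell and runs the time loop inside forward."""
    def __init__(self, cell):
        super().__init__()
        self.cell = cell

    def forward(self, input, hx):
        outputs = []
        for t in range(input.size(0)):    # iterate over the sequence dimension
            hx = self.cell(input[t], hx)  # reuse the same cell at every step
            outputs.append(hx)
        return torch.stack(outputs), hx

# usage with a built-in cell
rnn = RecurrentWrapper(nn.RNNCell(10, 20))
out, h_n = rnn(torch.randn(6, 3, 10), torch.zeros(3, 20))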

Thanks! But I don’t get how 1 or 2 are implemented there… the link is about different RNN cells.

A custom module to wrap the for loop in forward… I will try that.

I meant the examples in the docs for using RNN cells. They essentially apply the same module repeatedly.


Yup, got it now, thanks…

import torch
import torch.nn as nn
from torch.autograd import Variable

rnn = nn.RNNCell(10, 20)
input = Variable(torch.randn(6, 3, 10))  # (seq_len, batch, input_size)
hx = Variable(torch.randn(3, 20))        # initial hidden state
output = []
for i in range(6):
    hx = rnn(input[i], hx)  # this is the recurrence
    output.append(hx)


Hi, please take a look at Implementing Recurrence on an Arbitrary Layer. Thanks.

I assume that using a for loop would really slow things down. Is there any wrapper that would do something similar?

I don’t think there is another way to achieve this.
You have to execute it sequentially, since each iteration requires the result of the previous one.
The overhead of the loop should be negligible, since GPUs work asynchronously and the loop only puts the kernel calls in a queue.
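
If you want to see this, you can time the loop with and without an explicit synchronization; the Python loop itself should return almost immediately because it only enqueues the kernels (a rough sketch, assuming a CUDA device is available; sizes are arbitrary):

import time
import torch
import torch.nn as nn

rnn = nn.RNNCell(10, 20).cuda()
x = torch.randn(1000, 3, 10, device='cuda')
hx = torch.randn(3, 20, device='cuda')

start = time.time()
for t in range(x.size(0)):
    hx = rnn(x[t], hx)              # each step only enqueues GPU kernels
launch_time = time.time() - start

torch.cuda.synchronize()            # wait until all queued kernels have finished
total_time = time.time() - start

print('loop (kernel launch) time:', launch_time)
print('total time incl. compute :', total_time)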
