Stacked RNN with different hidden size at each layer?

Can I build a multi-layer RNN with a different hidden size per layer using PyTorch?

For example, a 3-layer RNN with feature sizes of 512, 256, and 128 at each layer respectively?


Yes, but you need to work out the input and output sizes of each RNN/LSTM/GRU layer yourself.

By ‘layer’ I mean the layers of a stacked RNN. PyTorch's RNN module only takes a single ‘hidden_size’ parameter, and all stacked layers have exactly the same hidden size. Is it possible to give the layers different hidden sizes?
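
To make the limitation concrete, here is a small sketch that inspects the weight shapes of a 3-layer nn.LSTM. The sizes 10 and 20 are arbitrary values chosen for illustration; the point is only that every stacked layer ends up with the same hidden size.

```python
import torch
from torch import nn

# One nn.LSTM with three stacked layers; hidden_size applies to every layer.
lstm = nn.LSTM(input_size=10, hidden_size=20, num_layers=3)

# For an LSTM, weight_hh_l{k} has shape (4 * hidden_size, hidden_size).
hh_shapes = [tuple(getattr(lstm, f"weight_hh_l{k}").shape) for k in range(3)]
print(hh_shapes)  # the same shape for all three layers

# Only layer 0 sees input_size; deeper layers take hidden_size as their input.
ih_shapes = [tuple(getattr(lstm, f"weight_ih_l{k}").shape) for k in range(3)]
print(ih_shapes)
```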

How about this:

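import torch
from torch import nn

# layer 1: input_size=10, hidden_size=20
rnn1 = nn.LSTM(10, 20, 1)
# inputs shape: (seq_len=5, batch=3, input_size=10)
inputs = torch.randn(5, 3, 10)
# output1 holds layer 1's hidden state at every time step: (5, 3, 20)
# hn is the tuple (h_n, c_n) of final states
output1, hn = rnn1(inputs)

# layer 2: input_size=20, hidden_size=30
rnn2 = nn.LSTM(20, 30, 1)
# output2 shape: (5, 3, 30)
output2, hn2 = rnn2(output1)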
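
The same chaining idea can be wrapped in a module for the 512/256/128 case from the question. This is only a sketch: StackedLSTM is a hypothetical helper name, not something PyTorch provides, and the input size of 10 is an arbitrary assumption.

```python
import torch
from torch import nn


class StackedLSTM(nn.Module):
    """Hypothetical helper: a chain of single-layer LSTMs with per-layer hidden sizes."""

    def __init__(self, input_size, hidden_sizes):
        super().__init__()
        sizes = [input_size] + list(hidden_sizes)
        # Layer k maps sizes[k] features to sizes[k + 1] features.
        self.layers = nn.ModuleList(
            [nn.LSTM(sizes[k], sizes[k + 1], num_layers=1)
             for k in range(len(hidden_sizes))]
        )

    def forward(self, x):
        # x: (seq_len, batch, input_size)
        states = []
        for layer in self.layers:
            x, state = layer(x)   # x now holds this layer's hidden state at every step
            states.append(state)  # (h_n, c_n) of this layer, kept in case it is needed
        return x, states


model = StackedLSTM(input_size=10, hidden_sizes=[512, 256, 128])
out, states = model(torch.randn(5, 3, 10))
print(out.shape)  # torch.Size([5, 3, 128])
```

Using nn.ModuleList rather than a plain Python list registers each LSTM's parameters with the parent module, so an optimizer built from model.parameters() sees all of them.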

It works well. Thanks!

Is it okay to discard the hidden state hn from the first layer?

It seems ok.

Could you please elaborate on that? I do not understand why we have to discard hn in the second layer when the scheme says that h2_t depends on both h2_t-1 and h1_t. Is there a way to handle both states in PyTorch?

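h2_t-1 is handled inside the second layer itself: rnn2 carries its own hidden state from one time step to the next. h1_t is included in the output of the previous layer: output1 holds the first layer's hidden state at every time step. hn is only the state after the last time step, so you don't need it to make your network work.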
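
A quick way to check this is to compare the last time step of the output with the returned final hidden state. The sketch below reuses the sizes from the example earlier in the thread and assumes a single-layer, unidirectional LSTM.

```python
import torch
from torch import nn

rnn1 = nn.LSTM(10, 20, 1)
inputs = torch.randn(5, 3, 10)

# For an LSTM the second return value is the tuple (h_n, c_n).
output1, (h_n, c_n) = rnn1(inputs)

# output1[t] is the hidden state h1_t for every time step t,
# while h_n only holds the hidden state after the final step.
print(output1.shape)  # torch.Size([5, 3, 20])
print(h_n.shape)      # torch.Size([1, 3, 20])

# The last slice of the output matches the final hidden state.
same = torch.allclose(output1[-1], h_n[0])
print(same)
```

So passing output1 to the next layer already gives it h1_t at every step, and h_n adds nothing new for that purpose.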
