Understanding the GRU/LSTM num_layers parameter

Hi,

I think I understand, but I wanted to check how the num_layers parameter works in LSTM and GRU. Am I correct in thinking that num_layers is effectively shorthand for stacking repeated layers? I.e., for a univariate time series and GRUs with a hidden state of size 10,

self.gru = nn.GRU(input_size=1, hidden_size=10, num_layers=2)

is equivalent to:

self.gru1 = nn.GRU(input_size=1, hidden_size=10)
self.gru2 = nn.GRU(input_size=10, hidden_size=10)

I think that is what the documentation is saying.
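
To make the equivalence concrete, here is a minimal sketch of how I imagine the two manual layers would be chained in forward() (the names and shapes here are just my illustration, not from the docs):

def forward(self, x):
    # x: (seq_len, batch, 1) for a univariate series
    out1, _ = self.gru1(x)       # out1: (seq_len, batch, 10)
    out2, h2 = self.gru2(out1)   # second layer consumes the first layer's outputs
    return out2, h2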

Thanks,

Tim

For clarity: the hidden size of 10 was picked at random. I think LSTM works the same way, but I only wrote up GRU as the example.

I’ll even take a yes or no answer so I know whether I should keep looking :slight_smile:

If anyone comes looking at this: someone just asked a similar question and got an answer confirming that the two forms are the same:

@timdnewman Run the following code in a Jupyter notebook (or any Python session) to see what nn.GRU() does: you’ll see that the prints have the same values.

import torch
import torch.nn as nn

# One two-layer GRU vs. two single-layer GRUs.
# bias=False keeps the parameter lists down to the weights only.
gru = nn.GRU(input_size=1, hidden_size=1, num_layers=2, bias=False)
gru_0 = nn.GRU(input_size=1, hidden_size=1, num_layers=1, bias=False)
gru_1 = nn.GRU(input_size=1, hidden_size=1, num_layers=1, bias=False)

# Save the parameters of the stacked GRU. With bias=False the order is
# weight_ih_l0, weight_hh_l0, weight_ih_l1, weight_hh_l1.
params = list(gru.named_parameters())

# Assign layer 0's weights to gru_0 and layer 1's weights to gru_1.
gru_0.weight_ih_l0 = params[0][1]
gru_0.weight_hh_l0 = params[1][1]
gru_1.weight_ih_l0 = params[2][1]
gru_1.weight_hh_l0 = params[3][1]

# Shape (seq_len=1, batch=1, input_size=1).
input_tensor = torch.tensor([[[3.]]])
_, hs = gru(input_tensor)
print(hs)

# Feed the first layer's output into the second layer.
out0, h0 = gru_0(input_tensor)
out1, h1 = gru_1(out0)
print(out0, out1)
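
To make the comparison explicit, you could also compare the tensors directly: hs stacks one final hidden state per layer, so (as I read the shapes) hs[0] should match h0[0] and hs[1] should match h1[0]:

# hs: (num_layers=2, batch, hidden); h0 and h1: (1, batch, hidden)
print(torch.allclose(hs[0], h0[0]))  # layer 0 of gru matches gru_0
print(torch.allclose(hs[1], h1[0]))  # layer 1 of gru matches gru_1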

Thanks! Although you’re about 3 hours too late :slight_smile:

I have marked you as the solution as that is definitive.

Do you know how to recommend an edit to the documentation? Something like this example would clarify it a lot, or at least I think so.

@timdnewman Thanks, a GRU check was in order for me. Regarding the docs, I have no idea, but if you find out, do let me know haha.

Here is the method, but it is a sufficiently long-winded process that I’ll probably just let people search for this thread :grin:
