timdnewman
(Timdnewman)
October 25, 2022, 3:16pm
1
Hi,
I think I understand, but I wanted to check how the num_layers parameter works in LSTM and GRU. Am I correct in thinking that num_layers is effectively shorthand for stacking layers, where each layer takes the previous layer's output as its input? I.e.
for a univariate time series and a GRU with a hidden state size of 10,
self.gru = nn.GRU(input_size=1,hidden_size=10,num_layers=2)
Is equivalent to:
self.gru1 = nn.GRU(input_size=1,hidden_size=10)
self.gru2 = nn.GRU(input_size=10,hidden_size=10)
I think that is what the documentation is saying.
Thanks,
Tim
timdnewman
(Timdnewman)
October 25, 2022, 3:18pm
2
For clarity, the hidden size of 10 was picked at random. I think LSTM is similar, but I only wrote up GRU as the example.
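One way to test the LSTM case directly is a sketch along these lines. It is only a hedged check, not something taken from the documentation: the seed, the sequence length of 5, the batch size of 1 and the variable names are assumptions for illustration, and it relies on the default dropout=0.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only for repeatability

stacked = nn.LSTM(input_size=1, hidden_size=10, num_layers=2)
lstm1 = nn.LSTM(input_size=1, hidden_size=10)
lstm2 = nn.LSTM(input_size=10, hidden_size=10)

# Load layer 0 and layer 1 of the stacked LSTM into the two single-layer LSTMs.
state = stacked.state_dict()
lstm1.load_state_dict({k: v for k, v in state.items() if k.endswith("_l0")})
lstm2.load_state_dict(
    {k.replace("_l1", "_l0"): v for k, v in state.items() if k.endswith("_l1")}
)

x = torch.randn(5, 1, 1)  # (seq_len, batch, features)

with torch.no_grad():
    out_stacked, (h_n, c_n) = stacked(x)
    out1, (h1, c1) = lstm1(x)
    out2, (h2, c2) = lstm2(out1)  # second LSTM consumes the first one's output

# Compare the top-layer outputs and the per-layer final cell states.
lstm_same_output = torch.allclose(out_stacked, out2, atol=1e-6)
lstm_same_cell = torch.allclose(c_n, torch.cat([c1, c2], dim=0), atol=1e-6)
print(lstm_same_output, lstm_same_cell)
```

If the stacked and chained versions really are the same computation, both flags should come out True.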
timdnewman
(Timdnewman)
October 31, 2022, 11:41am
3
I’ll even take a yes or no answer, so I know whether I should keep looking.
timdnewman
(Timdnewman)
November 3, 2022, 11:58am
4
If anyone comes looking at this: someone just asked a similar question and got the answer that the two are the same:
I’m working on time series problems.
On the web I saw 2 types of models:
Models which used one GRU with multiple layers.
nn.GRU(num_layers=4,…)
Models which used multiple GRUs (one after the other) with fewer (or an equal number of) layers.
nn.GRU(num_layers=2, ...)
nn.GRU(num_layers=2, ...)
I understand that using one GRU or multiple GRUs changes the number of weights to calculate and optimize,
but I didn’t find any explanation of the benefits of either of the 2 options above:
What is …
mvalente
(Miguel Valente)
November 3, 2022, 2:10pm
5
@timdnewman Run the following code in a Jupyter notebook to see what nn.GRU() does.
You’ll see that the two prints show the same values.
import torch
import torch.nn as nn

# One two-layer GRU, plus two single-layer GRUs to chain by hand.
gru = nn.GRU(input_size=1, hidden_size=1, num_layers=2, bias=False)
gru_0 = nn.GRU(input_size=1, hidden_size=1, num_layers=1, bias=False)
gru_1 = nn.GRU(input_size=1, hidden_size=1, num_layers=1, bias=False)

# Copy each layer's weights from the stacked GRU into the single-layer GRUs.
with torch.no_grad():
    gru_0.weight_ih_l0.copy_(gru.weight_ih_l0)
    gru_0.weight_hh_l0.copy_(gru.weight_hh_l0)
    gru_1.weight_ih_l0.copy_(gru.weight_ih_l1)
    gru_1.weight_hh_l0.copy_(gru.weight_hh_l1)

# Unbatched input: sequence length 1, one feature.
input_tensor = torch.tensor([[3.]])

# hs stacks the final hidden state of layer 0 and layer 1.
_, hs = gru(input_tensor)
print(hs)

# Feed the first GRU's output into the second one.
out0, h0 = gru_0(input_tensor)
out1, h1 = gru_1(out0)
print(out0, out1)
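The same check can be sketched for the sizes in the original question (input_size=1, hidden_size=10), with a longer sequence and the biases left on. This is a minimal sketch under stated assumptions rather than anything official: the helper name copy_layer, the seed, the sequence length of 5 and the batch size of 1 are made up for illustration, and the comparison relies on the default dropout=0.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only for repeatability

stacked = nn.GRU(input_size=1, hidden_size=10, num_layers=2)
gru1 = nn.GRU(input_size=1, hidden_size=10)
gru2 = nn.GRU(input_size=10, hidden_size=10)


def copy_layer(src, layer, dst):
    """Hypothetical helper: copy layer `layer` of `src` into single-layer `dst`."""
    with torch.no_grad():
        for name in ("weight_ih", "weight_hh", "bias_ih", "bias_hh"):
            getattr(dst, f"{name}_l0").copy_(getattr(src, f"{name}_l{layer}"))


copy_layer(stacked, 0, gru1)
copy_layer(stacked, 1, gru2)

x = torch.randn(5, 1, 1)  # (seq_len, batch, features)

with torch.no_grad():
    out_stacked, h_stacked = stacked(x)
    out_first, h_first = gru1(x)
    out_second, h_second = gru2(out_first)  # chain the two GRUs by hand

# The stacked output is the top layer's output; h_stacked holds one final
# hidden state per layer.
same_output = torch.allclose(out_stacked, out_second, atol=1e-6)
same_hidden = torch.allclose(
    h_stacked, torch.cat([h_first, h_second], dim=0), atol=1e-6
)
print(same_output, same_hidden)
```

Using torch.allclose instead of comparing printed values by eye tolerates tiny floating-point differences between the two code paths. Note that a nonzero dropout on the stacked GRU would apply between its layers during training, so the hand-chained version would need its own dropout layer to match.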
timdnewman
(Timdnewman)
November 3, 2022, 2:36pm
6
Thanks! Although you’re about 3 hours too late
I have marked you as the solution as that is definitive.
Do you know how to recommend an edit to the documentation? Something like this would clarify it a lot, or at least I think so.
mvalente
(Miguel Valente)
November 3, 2022, 2:46pm
7
@timdnewman Thanks, a GRU check was in order for me. Regarding the docs, I have no idea, but if you find out, do let me know haha.
timdnewman
(Timdnewman)
November 3, 2022, 3:12pm
8
Here is the method, but it is a sufficiently long-winded process that I’ll probably just let people search for this thread instead.