Understanding GRU/LSTM layers parameter

timdnewman · October 25, 2022, 3:16pm

Hi,

I think I understand, but I wanted to check how the num_layers parameter works in LSTM and GRU. Am I correct in thinking that num_layers is effectively shorthand for stacking separate layers, so that each layer takes the previous layer's output sequence as its input? That is,

for a univariate time series and GRUs with a hidden state size of 10,

self.gru = nn.GRU(input_size=1,hidden_size=10,num_layers=2)

Is equivalent to:

self.gru1 = nn.GRU(input_size=1,hidden_size=10)
self.gru2 = nn.GRU(input_size=10,hidden_size=10)

I think that is what the documentation is saying.

Thanks,

Tim

timdnewman · October 25, 2022, 3:18pm

For clarity, the hidden size of 10 was picked at random. I think LSTM is similar, but I only wrote up GRU as the example.
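A minimal sketch of how the LSTM case could be checked, assuming the default dropout=0 (with dropout > 0, a stacked module also applies dropout between layers) and a CPU run. The names lstm, lstm0, lstm1, and lstm_match are illustrative, not from the thread. The only difference from GRU is that the state is a pair (h, c).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Two-layer LSTM and two single-layer LSTMs with matching sizes.
lstm = nn.LSTM(input_size=1, hidden_size=10, num_layers=2)
lstm0 = nn.LSTM(input_size=1, hidden_size=10)
lstm1 = nn.LSTM(input_size=10, hidden_size=10)

with torch.no_grad():
    # Copy layer 0 and layer 1 parameters out of the stacked module.
    for name in ("weight_ih", "weight_hh", "bias_ih", "bias_hh"):
        getattr(lstm0, f"{name}_l0").copy_(getattr(lstm, f"{name}_l0"))
        getattr(lstm1, f"{name}_l0").copy_(getattr(lstm, f"{name}_l1"))

    x = torch.randn(5, 3, 1)  # (seq_len, batch, input_size)
    out, (h, c) = lstm(x)
    out0, (h0, c0) = lstm0(x)
    out1, (h1, c1) = lstm1(out0)  # second layer consumes the first layer's outputs

lstm_match = (
    torch.allclose(out, out1, atol=1e-5)
    and torch.allclose(h, torch.cat([h0, h1], dim=0), atol=1e-5)
    and torch.allclose(c, torch.cat([c0, c1], dim=0), atol=1e-5)
)
print(lstm_match)
```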

timdnewman · October 31, 2022, 11:41am

I’ll even take a yes or no answer, so I know whether I should keep looking.

timdnewman · November 3, 2022, 11:58am

If anyone comes looking for this: someone just asked a similar question and got the answer that they are the same:

mvalente · November 3, 2022, 2:10pm

@timdnewman Run the following code in a Jupyter notebook to see what nn.GRU() does.
You’ll see that both prints show the same values.

import torch
import torch.nn as nn

gru = nn.GRU(input_size=1,hidden_size=1,num_layers=2, bias=False)
gru_0 = nn.GRU(input_size=1,hidden_size=1,num_layers=1, bias=False)
gru_1 = nn.GRU(input_size=1,hidden_size=1,num_layers=1, bias=False)

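# Save the parameters of the two-layer GRU.
# With bias=False the order is: weight_ih_l0, weight_hh_l0, weight_ih_l1, weight_hh_l1.
params = list(gru.named_parameters())

# Assign each layer's weights to the matching single-layer GRU.
gru_0.weight_ih_l0 = params[0][1]
gru_0.weight_hh_l0 = params[1][1]
gru_1.weight_ih_l0 = params[2][1]
gru_1.weight_hh_l0 = params[3][1]

# Unbatched input: one time step with one feature.
input_tensor = torch.tensor([[3.]])
_, hs = gru(input_tensor)
print(hs)  # final hidden state of each layer

# Feed the output of the first GRU into the second one.
out0, h0 = gru_0(input_tensor)
out1, h1 = gru_1(out0)
print(out0, out1)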
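The same check can be sketched with the sizes from the original question (hidden size 10), biases left on, and a longer batched sequence, comparing with torch.allclose instead of reading the prints. This assumes the default dropout=0 and a CPU run; the names stacked, layer0, layer1, outputs_match, and hidden_match are illustrative.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Sizes from the original question: univariate input, hidden size 10.
stacked = nn.GRU(input_size=1, hidden_size=10, num_layers=2)
layer0 = nn.GRU(input_size=1, hidden_size=10)
layer1 = nn.GRU(input_size=10, hidden_size=10)

# Copy each layer's weights and biases out of the stacked module.
with torch.no_grad():
    for name in ("weight_ih", "weight_hh", "bias_ih", "bias_hh"):
        getattr(layer0, f"{name}_l0").copy_(getattr(stacked, f"{name}_l0"))
        getattr(layer1, f"{name}_l0").copy_(getattr(stacked, f"{name}_l1"))

x = torch.randn(5, 3, 1)  # (seq_len, batch, input_size), since batch_first=False

with torch.no_grad():
    out_stacked, h_stacked = stacked(x)
    out0, h0 = layer0(x)
    out1, h1 = layer1(out0)  # second layer consumes the first layer's outputs

# The stacked output is the last layer's output sequence;
# the stacked hidden state holds one final state per layer.
outputs_match = torch.allclose(out_stacked, out1, atol=1e-5)
hidden_match = torch.allclose(h_stacked, torch.cat([h0, h1], dim=0), atol=1e-5)
print(outputs_match, hidden_match)
```

Note that the stacked module only returns the output sequence of the last layer, so out0 has no counterpart in its return values apart from its final time step, which appears as h_stacked[0].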

timdnewman · November 3, 2022, 2:36pm

Thanks! Although you’re about 3 hours too late

I have marked you as the solution as that is definitive.

Do you know how to recommend an edit to the documentation? Something like this would clarify it a lot, or at least I think so.

mvalente · November 3, 2022, 2:46pm

@timdnewman. Thanks, a GRU check was in order for me. Regarding the docs, I have no idea, but if you find out do let me know haha.

timdnewman · November 3, 2022, 3:12pm

Here is the method, but it is a sufficiently long-winded process that I’ll probably just let people search for this thread instead:

https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md#writing-documentation
