# Input shape to GRU layer

Greetings. I am getting confused about the input shape to a GRU layer.
I have one batch of 128 images, and I extracted 9 features from each image.
So now my shape is `(1, 128, 9)`.
Here 1 is the batch, 128 is the number of images in the batch, and 9 is the number of features per image. The images are in sequence, for example 128 frames of a video. So one batch simply represents one video.

This is the GRU layer

```python
gru = torch.nn.GRU(input_size=128, hidden_size=8, batch_first=True)
```

Question 1: Is the `input_size=128` correctly defined?
Here is the code of the forward function:

```python
def forward(self, features):
    features = features.permute(0, 2, 1)  # [1, 9, 128]
    x2, _ = self.gru(features)
```

Question 2: Is the code in the forward function correctly defined?

Question 3: In the shape `(1, 128, 9)`, 9 is the sequence length and 128 is the input. Is this correct, or is it the opposite way?

Thanks

You also need a hidden state of size `(num_layers, batch_size, hidden_size)`, which can start out as zeros and evolve at each time step.

See here for an example:

https://pytorch.org/docs/stable/generated/torch.nn.GRU.html

I saw you updated your question. This is the example given on the docs page:

```python
rnn = nn.GRU(10, 20, 2)
input = torch.randn(5, 3, 10)
h0 = torch.randn(2, 3, 20)
output, hn = rnn(input, h0)
```

Note that you’re still missing the hidden state `h0` in your code.

The GRU layer expects two inputs. The hidden input is how you convey contextual information to the model between timesteps.
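As a minimal sketch of what passing that second input looks like (assuming `batch_first=True` and the shapes from the question, with `hidden_size=8` as in the original layer), you could build the initial hidden state from zeros yourself:

```python
import torch

# Shapes from the question: one video per batch, 128 frames, 9 features per frame
batch_size, seq_len, num_features = 1, 128, 9
hidden_size, num_layers = 8, 1

gru = torch.nn.GRU(input_size=num_features, hidden_size=hidden_size, batch_first=True)

x = torch.randn(batch_size, seq_len, num_features)      # (N, L, H_in)
h0 = torch.zeros(num_layers, batch_size, hidden_size)   # (num_layers, N, H_out)

out, hn = gru(x, h0)
print(out.shape)  # torch.Size([1, 128, 8])
print(hn.shape)   # torch.Size([1, 1, 8])
```

Note that `hn` keeps the `(num_layers, batch, hidden_size)` layout regardless of `batch_first`.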

Q1. No, it is incorrect to have the batch size at `dim=1` when you have clearly set `batch_first=True`. You have to choose how you want to send your data in and get it out, then set that argument accordingly. If you set `batch_first=True`, your input shape should be (128, 1, 9).

Q2. No. You’re not giving the model a hidden state.

Q3. The order can be N, L, H with `batch_first=True` or L, N, H if `False` where L is length, N is batch size and H is number of features.
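A quick sketch of the two layouts (example values for N, L, H, with a hidden size of 16 picked arbitrarily):

```python
import torch

N, L, H = 4, 10, 9  # batch size, sequence length, number of features

gru_bf = torch.nn.GRU(input_size=H, hidden_size=16, batch_first=True)
gru_lf = torch.nn.GRU(input_size=H, hidden_size=16, batch_first=False)

out_bf, _ = gru_bf(torch.randn(N, L, H))  # batch_first=True  expects (N, L, H)
out_lf, _ = gru_lf(torch.randn(L, N, H))  # batch_first=False expects (L, N, H)

print(out_bf.shape)  # torch.Size([4, 10, 16])
print(out_lf.shape)  # torch.Size([10, 4, 16])
```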

Last issue, when you instantiated the GRU class, you put the input size as 128, which you have indicated to be your batch size. That is incorrect. Input size should equal the number of features, in this case, 9.

`input_size` refers to the number of features, in your case 9. You don’t specify the sequence length when defining an `nn.GRU` or `nn.LSTM`. Recurrent Neural Networks go through each item in your sequence no matter how long.
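To illustrate that the sequence length is not baked into the layer, the same `nn.GRU` instance can process sequences of any length (the hidden size of 16 here is just an example value):

```python
import torch

gru = torch.nn.GRU(input_size=9, hidden_size=16, batch_first=True)

# Only input_size is fixed at construction; the layer steps through
# however many timesteps the input happens to have.
for seq_len in (5, 128, 300):
    out, hn = gru(torch.randn(1, seq_len, 9))
    print(out.shape)  # (1, seq_len, 16)
```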

`hidden_size` is the size of the internal state of your `nn.GRU`. You probably want to set it to a larger value than 9 :), as it might have to capture a lot of information.

So you should define your recurrent layer as follows:

```python
gru = torch.nn.GRU(input_size=9, hidden_size=256, batch_first=True)  # the 256 is just a suggestion
```

Since you set `batch_first=True`, your `gru` layer expects an input of `(batch_size, seq_len, input_size)`, which in your example would be `(1, 128, 9)`. As this already seems to be your shape, there should be no need to permute the dimensions.
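Putting the pieces together, a minimal sketch of the corrected module could look like this (the class name `VideoGRU` is hypothetical, chosen for illustration):

```python
import torch

class VideoGRU(torch.nn.Module):  # hypothetical name, for illustration
    def __init__(self):
        super().__init__()
        self.gru = torch.nn.GRU(input_size=9, hidden_size=256, batch_first=True)

    def forward(self, features):
        # features: (batch, seq_len, input_size) = (1, 128, 9) -- no permute needed
        x2, _ = self.gru(features)
        return x2

model = VideoGRU()
out = model(torch.randn(1, 128, 9))
print(out.shape)  # torch.Size([1, 128, 256])
```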

@J_Johnson You don’t have to explicitly give an initial hidden state as input – although it’s probably a good practice to always do so.

If you don’t specify an initial hidden state, it gets initialized with a 0-vector. You can check the source code. The signature for the `forward()` method is

```python
def forward(self, input: Tensor, hx: Optional[Tensor] = None) -> Tuple[Tensor, Tensor]:
```

So `hx` is optional. If it is `None`, then it gets initialized directly in the `forward()` method as follows: `hx = torch.zeros(...)`
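You can verify this equivalence directly: calling the layer with `hx=None` gives exactly the same output as passing an explicit all-zeros hidden state (shapes taken from the question):

```python
import torch

gru = torch.nn.GRU(input_size=9, hidden_size=8, batch_first=True)
x = torch.randn(1, 128, 9)

out_default, _ = gru(x)  # hx=None -> initialized to zeros internally

h0 = torch.zeros(1, 1, 8)  # (num_layers, batch_size, hidden_size)
out_explicit, _ = gru(x, h0)

print(torch.equal(out_default, out_explicit))  # True
```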

I just assume someone making use of a GRU with a sequence length of 1 intends to pass in later timesteps via their model’s forward method. So while that might not give any errors, it sure would negate any benefit of using an RNN class layer.

I thought 1 is the `seq_len` and 128 is the `batch_size`. But, yeah, reading "I have one batch of 128 images" again, I'm not so sure anymore.

Hi, thanks for the detailed answer. Sorry, my bad: my batch size is actually 1, each batch has 128 images, and each image has 9 features.
Second thing: if we don't define the hidden state, won't it be handled automatically, since it did not throw any error? Thanks.

Apologies on my side. My batch size is 1, each batch has 128 images, and each image has 9 features. For example, if I had a batch size of 2, the shape would be `(2, 128, 9)`.