Question about RNN output and hidden state!

davidlee · September 28, 2019, 1:23pm

I have an input of shape (sequence_length=10, batch=32, input_size=1000).

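import torch
import torch.nn as nn

gru = nn.GRU(input_size=1000, hidden_size=200)

input = torch.randn(10, 32, 1000)
hidden = torch.randn(1, 32, 200)

# Case 1: feed the whole sequence in one call
gru_output, gru_hidden = gru(input, hidden)

# Case 2: feed one time step per call, carrying the hidden state forward
gru_output = []
for i in range(10):
    output, hidden = gru(input[i:i+1], hidden)  # slice keeps the (1, 32, 1000) shape
    gru_output.append(output)
gru_output = torch.cat(gru_output, dim=0)  # (10, 32, 200)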

Both cases (1 and 2) yield a gru_output of the same size, (10, 32, 200).

But the first feeds the whole input in one call, while the second feeds it one time step at a time.
Is there any difference in the results (output or hidden state of the GRU) between the two cases?

vdw · September 28, 2019, 1:38pm

It should be different. In Case 1, all input sequences in the batch start with the same hidden state and are independent. In Case 2, the last hidden state of step i becomes the initial hidden state for step i+1, so each sentence in the batch starts with a different hidden state.
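
The claim can be checked empirically. Below is a minimal sketch, not a definitive test: the fixed seed, the variable names (`x`, `h0`, `out_full`, `out_loop`) and the `atol=1e-4` tolerance are assumptions added for illustration. It assumes dimension 0 is the time axis, as stated in the question. Under that reading, nn.GRU already passes each step's hidden state to the next step inside a single call, so feeding one time step at a time while carrying `hidden` forward would be expected to reproduce the full-sequence result up to floating-point error. The two cases would only diverge if each `input[i]` were treated as a separate sequence rather than a time step.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # assumption: fixed seed so the run is repeatable

seq_len, batch, input_size, hidden_size = 10, 32, 1000, 200
gru = nn.GRU(input_size=input_size, hidden_size=hidden_size)

x = torch.randn(seq_len, batch, input_size)
h0 = torch.randn(1, batch, hidden_size)  # same initial hidden state for both cases

with torch.no_grad():
    # Case 1: the whole sequence in one call
    out_full, h_full = gru(x, h0)

    # Case 2: one time step per call, carrying the hidden state forward
    h = h0
    steps = []
    for t in range(seq_len):
        out_t, h = gru(x[t:t + 1], h)  # slice keeps shape (1, batch, input_size)
        steps.append(out_t)
    out_loop = torch.cat(steps, dim=0)

# Compare outputs and final hidden states within a small tolerance (assumed value)
same_output = torch.allclose(out_full, out_loop, atol=1e-4)
same_hidden = torch.allclose(h_full, h, atol=1e-4)

print(out_full.shape, out_loop.shape)
print("outputs match:", same_output, "| final hidden states match:", same_hidden)
```

If the two flags come out differently on a given setup, the usual suspects are a different initial hidden state between the two runs, dropout in training mode, or slicing with `x[t]` instead of `x[t:t + 1]`.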

dejanbatanjac · September 28, 2019, 3:02pm

@davidlee, please edit your question and wrap the code in a ``` block.