I have an input whose shape is (sequence_length=10, batch=32, input_size=1000).
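import torch
import torch.nn as nn
# Case 1: feed the whole sequence to the GRU in a single call
gru = nn.GRU(input_size=1000, hidden_size=200)
input = torch.randn(10, 32, 1000)   # (seq_len, batch, input_size)
hidden = torch.randn(1, 32, 200)    # (num_layers, batch, hidden_size)
gru_output, gru_hidden = gru(input, hidden)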
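# Case 2: feed one time step per call, carrying the hidden state forward
outputs = []
for i in range(10):
    out, hidden = gru(input[i:i+1], hidden)  # slice keeps the time dim: (1, 32, 1000)
    outputs.append(out)
gru_output = torch.cat(outputs, dim=0)       # (10, 32, 200)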
Both cases (1 and 2) yield a gru_output of the same size, (10, 32, 200).
But the first feeds the whole sequence in at once, while the second feeds it one time step at a time.
Is there any difference in the results (the output or the hidden state of the GRU) between the two cases?
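One way to check this empirically is to run both cases on the same input and initial hidden state and compare the results. The snippet below is only a minimal sketch under a few assumptions: the step-by-step loop keeps the time dimension with x[t:t+1] and carries the hidden state forward, the GRU is single-layer and unidirectional as in the question, and the names x, h0, out_full, out_step and the tolerance atol=1e-5 are illustrative choices rather than anything from the original code.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

gru = nn.GRU(input_size=1000, hidden_size=200)
x = torch.randn(10, 32, 1000)   # (seq_len, batch, input_size)
h0 = torch.randn(1, 32, 200)    # (num_layers, batch, hidden_size)

with torch.no_grad():
    # Case 1: whole sequence in one call
    out_full, h_full = gru(x, h0)

    # Case 2: one time step per call, reusing the returned hidden state
    h = h0
    steps = []
    for t in range(x.size(0)):
        out_t, h = gru(x[t:t+1], h)   # x[t:t+1] has shape (1, 32, 1000)
        steps.append(out_t)
    out_step = torch.cat(steps, dim=0)

# Compare up to floating-point tolerance (assumed tolerance, not a guarantee)
same_output = torch.allclose(out_full, out_step, atol=1e-5)
same_hidden = torch.allclose(h_full, h, atol=1e-5)
print(same_output, same_hidden)
```

If both comparisons come out true, the two ways of feeding the sequence agree up to small numerical differences, and for a single-layer GRU the final hidden state should also match the last time step of the output.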