Hi,
In the seq2seq tutorial in the docs (http://pytorch.org/tutorials/intermediate/seq2seq_translation_tutorial.html#the-seq2seq-model), I don't understand the following part:
    for i in range(self.n_layers):
        output, hidden = self.gru(output, hidden)
What is the purpose of n_layers here? Is it meant to implement stacked GRUs? If so, shouldn't each layer have its own hidden state, rather than the same hidden state being passed from one iteration to the next?
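To make the question concrete, here is a minimal sketch of the two behaviours as I understand them. The sizes and the names `shared_gru` and `stacked_gru` are my own and are not from the tutorial; the second half assumes that "stacked" means what `nn.GRU(..., num_layers=n)` does.

```python
import torch
import torch.nn as nn

hidden_size, n_layers = 8, 2  # illustrative sizes, not from the tutorial

# Tutorial-style loop: ONE single-layer GRU applied n_layers times.
# The same weights are reused on every pass, and the hidden state
# returned by one pass is fed straight into the next pass.
shared_gru = nn.GRU(hidden_size, hidden_size)
output = torch.randn(1, 1, hidden_size)  # (seq_len, batch, hidden_size)
hidden = torch.zeros(1, 1, hidden_size)  # (num_layers=1, batch, hidden_size)
for _ in range(n_layers):
    output, hidden = shared_gru(output, hidden)

# What I would expect from a stacked GRU: separate weights per layer
# and one hidden state per layer.
stacked_gru = nn.GRU(hidden_size, hidden_size, num_layers=n_layers)
x = torch.randn(1, 1, hidden_size)
h0 = torch.zeros(n_layers, 1, hidden_size)  # one slice per layer
stacked_out, stacked_hidden = stacked_gru(x, h0)

loop_hidden_shape = tuple(hidden.shape)
stacked_hidden_shape = tuple(stacked_hidden.shape)
n_loop_params = sum(p.numel() for p in shared_gru.parameters())
n_stacked_params = sum(p.numel() for p in stacked_gru.parameters())

print("loop hidden shape:   ", loop_hidden_shape)
print("stacked hidden shape:", stacked_hidden_shape)
print("loop params:", n_loop_params, "stacked params:", n_stacked_params)
```

In the loop version there is only a single hidden state and a single set of weights no matter how large `n_layers` is, which is why it does not look like a stacked GRU to me.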
Thanks