It is not clear how batching happens in the language model. In particular, is the input to the model in every iteration of the loop [seq_length, batch_size, embed_size] or [batch_size, seq_length, embed_size]?
Also, why does the RNN model return output and hidden separately? They seem to be the same, since for an RNN layer the hidden state itself is the output.
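For context, here is a small probe I ran to inspect the shapes (a toy sketch with arbitrary sizes, assuming a recent PyTorch; `nn.RNN` defaults to `batch_first=False`, i.e. [seq_len, batch, input_size] input):

```python
import torch
import torch.nn as nn

# Toy sizes, chosen only for illustration.
seq_len, batch_size, embed_size, hidden_size, num_layers = 5, 3, 4, 6, 2

# batch_first=False by default, so input is [seq_len, batch_size, embed_size]
rnn = nn.RNN(embed_size, hidden_size, num_layers)
inp = torch.randn(seq_len, batch_size, embed_size)
h0 = torch.zeros(num_layers, batch_size, hidden_size)

output, hidden = rnn(inp, h0)

# output holds the last layer's hidden state for every time step
print(output.shape)  # torch.Size([5, 3, 6])
# hidden holds every layer's hidden state at the final time step only
print(hidden.shape)  # torch.Size([2, 3, 6])

# The two overlap only at the last layer / last time step:
print(torch.allclose(output[-1], hidden[-1]))  # True
```

So at least shape-wise the two return values differ, which is part of what confused me.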
I also opened a GitHub issue about this: https://github.com/pytorch/examples/issues/286
Thanks for the awesome library.