Why is nn.RNN’s batch_first parameter set to False by default?

Hello everyone. I’m a beginner at NLP, and I’m learning how to use nn.RNN from the PyTorch docs.

(Before I start: I’m not good at English grammar XD)

In the docs, nn.RNN’s parameter ‘batch_first’ is False by default.

It seems that people usually use batch_first=False. (I have seen many input datasets shaped that way, not only in simple NLP models but also in complicated ones, e.g. BERT.)

Can you tell me why people shape their datasets as T x B x * (where T is the max sequence length and B is the batch size)?

Are there any advantages to shaping my dataset like that?

Thank you!!

There are no strict advantages or disadvantages. It could be because, in an RNN, we’re iterating over the sequence dimension (we take timestep 0, then timestep 1, etc.), so it makes “sense” to have that dimension first. But it doesn’t really make a difference.
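To make the "iterating over the sequence dimension" point concrete, here is a rough sketch using nn.RNNCell and a manual loop. The sizes T, B, F, H are arbitrary values I picked for illustration, and this is not how nn.RNN is implemented internally; it only shows that with a T x B x * tensor, a plain Python loop hands you one timestep at a time.

```python
import torch
import torch.nn as nn

# Arbitrary illustrative sizes: sequence length, batch size,
# input features, hidden size.
T, B, F, H = 5, 3, 4, 8

cell = nn.RNNCell(input_size=F, hidden_size=H)

x = torch.randn(T, B, F)   # time-first layout: T x B x *
h = torch.zeros(B, H)      # initial hidden state for the whole batch

step_shapes = []
for x_t in x:              # iterating a tensor walks dim 0, i.e. the timesteps
    step_shapes.append(tuple(x_t.shape))  # each x_t is (B, F)
    h = cell(x_t, h)       # one recurrent step for every sequence in the batch

print(step_shapes[0], tuple(h.shape))
```

With a batch-first tensor you would write the same loop over `x.transpose(0, 1)` or index `x[:, t]` instead, so it is mostly a matter of convenience.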

I would suggest just going with the default option, simply because it’s the default.
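If you want to check for yourself that the layout doesn’t change the result, something like the sketch below should work. It builds two nn.RNN modules with the same weights, one per layout, and compares their outputs. The sizes and variable names are just my own placeholders.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Arbitrary illustrative sizes: sequence length, batch size,
# input features, hidden size.
T, B, F, H = 5, 3, 4, 8

rnn_tf = nn.RNN(input_size=F, hidden_size=H)                    # batch_first=False (default)
rnn_bf = nn.RNN(input_size=F, hidden_size=H, batch_first=True)
rnn_bf.load_state_dict(rnn_tf.state_dict())                     # copy weights so both match

x_tf = torch.randn(T, B, F)                  # T x B x *
x_bf = x_tf.transpose(0, 1).contiguous()     # same data as B x T x *

with torch.no_grad():
    out_tf, h_tf = rnn_tf(x_tf)   # out_tf: (T, B, H)
    out_bf, h_bf = rnn_bf(x_bf)   # out_bf: (B, T, H)

# The outputs agree once the first two dimensions are swapped back.
same_output = torch.allclose(out_tf, out_bf.transpose(0, 1), atol=1e-6)
# The final hidden state is (num_layers, B, H) in both cases.
same_hidden = torch.allclose(h_tf, h_bf, atol=1e-6)

print(same_output, same_hidden)
```

Note that batch_first only affects the input and output tensors; the hidden state keeps the same shape either way.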


Thank you for your helpful answer :smiley: