Why is nn.RNN's default input shape (seq_len, batch, input_size) instead of (batch, seq_len, input_size)?

Is there some performance gain?
It’s more natural to me to have the batch size as the first dimension.

I guess that in an RNN, what gets updated at every time instance is
So their choice seems to be the more natural one.
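
To make the two layouts concrete, here is a small sketch I put together (assuming PyTorch is installed; the sizes and variable names are made up for illustration). It only shows that slicing one time step out of a (seq_len, batch, input_size) tensor gives a contiguous (batch, input_size) block, while the same slice from a (batch, seq_len, input_size) tensor is strided, and that `batch_first=True` produces the same outputs with the first two dimensions swapped. It does not measure whether this makes any real difference in speed.

```python
import torch
import torch.nn as nn

# Illustrative sizes only (not from the original question).
seq_len, batch, input_size, hidden_size = 5, 3, 4, 6

# Default layout: (seq_len, batch, input_size).
x_time_major = torch.randn(seq_len, batch, input_size)
rnn = nn.RNN(input_size, hidden_size)
out_tm, h_tm = rnn(x_time_major)  # out_tm: (seq_len, batch, hidden_size)

# Same data in batch-major layout: (batch, seq_len, input_size).
x_batch_major = x_time_major.transpose(0, 1).contiguous()

# The input for a single time step, taken from each layout.
step_tm = x_time_major[0]        # (batch, input_size), one contiguous block
step_bm = x_batch_major[:, 0]    # (batch, input_size), strided view

print(step_tm.is_contiguous(), step_bm.is_contiguous())

# batch_first=True accepts the batch-major layout directly.
rnn_bf = nn.RNN(input_size, hidden_size, batch_first=True)
rnn_bf.load_state_dict(rnn.state_dict())  # reuse the same weights
out_bf, h_bf = rnn_bf(x_batch_major)      # out_bf: (batch, seq_len, hidden_size)

# Both layouts give the same values, just with the first two dims swapped.
print(torch.allclose(out_tm, out_bf.transpose(0, 1), atol=1e-5))
```

So if batch-first feels more natural, passing `batch_first=True` when constructing the module is enough; no manual transposing is needed.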