What's the advantage of [seq_len, batch, input_size] in RNN?

I'm learning about RNNs, and I got confused when I tried to feed input data to an RNN (RNNBase) model.
The input data has to be organized in the form [seq_len, batch, input_size].
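
To make the default layout concrete, here is a minimal sketch that assumes PyTorch's `torch.nn.RNN` with its default `batch_first=False`. The sizes (5, 3, 10, 20) are arbitrary values chosen only for illustration.

```python
import torch
import torch.nn as nn

# Illustrative sizes only (not taken from any real model).
seq_len, batch, input_size, hidden_size = 5, 3, 10, 20

# batch_first defaults to False, so the RNN expects time-major input.
rnn = nn.RNN(input_size, hidden_size)

# Input laid out as [seq_len, batch, input_size].
x = torch.randn(seq_len, batch, input_size)
output, h_n = rnn(x)

output_shape = tuple(output.shape)  # [seq_len, batch, hidden_size]
h_n_shape = tuple(h_n.shape)        # [num_layers, batch, hidden_size]
print(output_shape, h_n_shape)
```

The output keeps the same time-major layout as the input, with `hidden_size` in place of `input_size`.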

To get [batch, seq, feature] instead, we have to set batch_first=True.
Presumably the author had a good reason, so why is the source code written this way?
Is there any advantage to putting the batch dimension second?
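
For comparison, here is a small sketch of the `batch_first=True` variant, again assuming `torch.nn.RNN` and arbitrary illustrative sizes. It also checks one property that might be relevant to the question: in the time-major layout, the slice for a single time step is contiguous in memory, while in the batch-first layout it is not. Whether this is the actual reason for the default is an assumption on my part, not something confirmed here.

```python
import torch
import torch.nn as nn

# Illustrative sizes only.
seq_len, batch, input_size, hidden_size = 5, 3, 10, 20

# With batch_first=True the RNN accepts [batch, seq, feature].
rnn_bf = nn.RNN(input_size, hidden_size, batch_first=True)
x_bf = torch.randn(batch, seq_len, input_size)
out_bf, h_bf = rnn_bf(x_bf)

out_bf_shape = tuple(out_bf.shape)  # [batch, seq, hidden_size]
h_bf_shape = tuple(h_bf.shape)      # hidden state stays [num_layers, batch, hidden_size]

# Same data rearranged into the time-major layout [seq_len, batch, input_size].
x_tm = x_bf.transpose(0, 1).contiguous()

# One time step for the whole batch, taken from each layout.
step_tm_contig = x_tm[0].is_contiguous()     # slice along the first dimension
step_bf_contig = x_bf[:, 0].is_contiguous()  # strided slice along the second dimension
print(out_bf_shape, h_bf_shape, step_tm_contig, step_bf_contig)
```

Since an RNN processes one time step at a time across the whole batch, a contiguous per-step slice is at least a plausible convenience of the batch-second layout.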