LSTM hidden shape problem

If I have an nn.LSTM with a hidden size of 400, one layer, and batch_first=True, then the h_0 and c_0 inputs should be of shape (batch_size, 1, 400), or do I not understand something correctly? It throws me an error:

Expected hidden[0] size (1, 2, 400), got (2, 1, 400) (My batch_size is 2)

The shape of the hidden state is not affected by the value of batch_first; that flag only changes the layout of the input and output tensors. The hidden and cell states are always:

(num_layers * num_directions, batch_size, hidden_dim)

So with one unidirectional layer and a batch size of 2, you should pass h_0 and c_0 of shape (1, 2, 400), exactly as the error message says.
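A minimal sketch illustrating this (the input size, sequence length, and batch size below are arbitrary values chosen for the example):

```python
import torch
import torch.nn as nn

batch_size = 2
seq_len = 5       # arbitrary for this example
input_dim = 10    # arbitrary for this example
hidden_dim = 400
num_layers = 1
num_directions = 1  # 1 because bidirectional=False

lstm = nn.LSTM(input_size=input_dim, hidden_size=hidden_dim,
               num_layers=num_layers, batch_first=True)

# batch_first=True affects only the input/output tensors:
x = torch.randn(batch_size, seq_len, input_dim)

# h_0 and c_0 are always (num_layers * num_directions, batch, hidden):
h0 = torch.zeros(num_layers * num_directions, batch_size, hidden_dim)
c0 = torch.zeros(num_layers * num_directions, batch_size, hidden_dim)

out, (hn, cn) = lstm(x, (h0, c0))
print(out.shape)  # torch.Size([2, 5, 400]) -- batch first, as requested
print(hn.shape)   # torch.Size([1, 2, 400]) -- batch second, always
```

Passing (2, 1, 400) instead, i.e. swapping the first two dimensions as in the question, reproduces the "Expected hidden[0] size (1, 2, 400)" error.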