Sliding window inside a batch created with DataLoader: batch_first=?

I have the following data: a time sequence of 20653 points with 18 features/labels each.
I created sliding windows of size 8 and split them into training and test sets.
This gave tensors of shape [14386, 8, 18] and [6161, 8, 18] (X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.7, random_state=RANDOM_SEED, shuffle=False)).
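For reference, here is a minimal sketch of the windowing and split, not my exact code. It assumes random placeholder data (100 time steps instead of the real series), uses Tensor.unfold to build the windows, and uses a plain index split as a stand-in for train_test_split(..., shuffle=False).

```python
import torch

# Placeholder data standing in for the real series: 100 time steps, 18 features.
series = torch.randn(100, 18)
window = 8

# unfold along the time axis yields [num_windows, 18, window];
# permute so each sample is [window, 18], i.e. [num_windows, 8, 18] overall.
windows = series.unfold(0, window, 1).permute(0, 2, 1)

# Chronological 70/30 split without shuffling (stand-in for train_test_split).
n_train = int(0.7 * windows.shape[0])
X_train, X_test = windows[:n_train], windows[n_train:]

print(windows.shape)                  # torch.Size([93, 8, 18])
print(X_train.shape, X_test.shape)
```

Each window here is 8 consecutive time steps, so the axes are [sample, time step within window, feature].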
I then decided to use DataLoader to create batches of size 288.
This gave 50 tensors of size [288, 8, 18] (loader = DataLoader(my_data, batch_size=288, shuffle=False, drop_last=False)).
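A small sketch of the batching step, assuming my_data is a TensorDataset wrapping a tensor of the training shape (zeros here as placeholder data; the real dataset object may differ). It also shows that with drop_last=False the final batch is smaller than 288, because 14386 is not a multiple of 288.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder tensor with the same shape as the training windows.
X_train = torch.zeros(14386, 8, 18)

# Assumption: the dataset simply wraps the window tensor.
my_data = TensorDataset(X_train)
loader = DataLoader(my_data, batch_size=288, shuffle=False, drop_last=False)

# TensorDataset yields tuples, so each batch is a list holding one tensor.
shapes = [tuple(batch[0].shape) for batch in loader]

print(len(shapes))   # 50
print(shapes[0])     # (288, 8, 18)
print(shapes[-1])    # (274, 8, 18)
```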
Before creating batches I had been using the following LSTM: self.lstm = nn.LSTM(input_size, hidden_size, num_layers, bidirectional=True, dropout=0.2, batch_first=False). My question is: since DataLoader returns the batch size as the first dimension ([288, 8, 18]), should I use batch_first=True in the LSTM? It is not clear to me because I also use a sliding window here. I should add that with batch_first=True the network does not learn (no improvement in results), so I will stick with batch_first=False for now, but my understanding is weak at best.
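To make the question concrete, here is a small shape check of how nn.LSTM reads the same [288, 8, 18] tensor under both settings. The values hidden_size=32 and num_layers=2 are placeholders, not my real settings, and the input is random data. The output tensor has the same shape either way, but the final hidden state h_n always has the batch on its second axis, so it shows which input axis the LSTM treated as the batch.

```python
import torch
import torch.nn as nn

# One batch as produced by the DataLoader: [288, 8, 18] (random placeholder data).
batch = torch.randn(288, 8, 18)

# Placeholder hyperparameters, chosen only for this shape check.
input_size, hidden_size, num_layers = 18, 32, 2

lstm_batch_first = nn.LSTM(input_size, hidden_size, num_layers,
                           bidirectional=True, dropout=0.2, batch_first=True)
lstm_seq_first = nn.LSTM(input_size, hidden_size, num_layers,
                         bidirectional=True, dropout=0.2, batch_first=False)

out_bf, (h_bf, _) = lstm_batch_first(batch)
out_sf, (h_sf, _) = lstm_seq_first(batch)

# h_n has shape [num_layers * 2, batch, hidden_size] regardless of batch_first.
print(out_bf.shape, h_bf.shape)  # torch.Size([288, 8, 64]) torch.Size([4, 288, 32])
print(out_sf.shape, h_sf.shape)  # torch.Size([288, 8, 64]) torch.Size([4, 8, 32])
```

So in this sketch, batch_first=True treats 288 as the batch and 8 as the sequence length, while batch_first=False treats 288 as the sequence length and 8 as the batch.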