shuffle=True gives weird results

I am trying to make predictions on stock data with a simple LSTM model.
I split the data chronologically: the first 80% is the training set and the last 20% is the test set.

The data has shape (batch_size, seq_len, feature_dims).

Here is my code for the data loader:


When I set shuffle=False, everything works fine (left pictures).

However, when I change shuffle to True, the predicted values become weird (right pictures): the output gets stuck in a very small range.

Can anyone give me any suggestion? Thanks!

If shuffling interferes negatively with your training, it seems that your training routine or model depends on the sequential input of the data.
E.g. if you are using an RNN-like model, you might pass the batches sequentially and reuse the hidden states between them. In that case, shuffling the data would destroy the temporal correspondence.
You could try to shuffle “segments” of the data, but it depends of course on your actual use case.
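
To make the “segments” idea concrete, here is a minimal sketch in plain Python. It assumes your samples are indexed in chronological order from 0 to n_samples - 1; the helper names `make_segments` and `shuffled_segment_order` and the `segment_len` parameter are made up for illustration and are not part of PyTorch. Only the order of the segments is shuffled, while the samples inside each segment stay in their original order.

```python
import random


def make_segments(n_samples, segment_len):
    """Split indices 0..n_samples-1 into contiguous, in-order segments.

    The last segment may be shorter if n_samples is not a multiple
    of segment_len.
    """
    return [
        list(range(start, min(start + segment_len, n_samples)))
        for start in range(0, n_samples, segment_len)
    ]


def shuffled_segment_order(n_samples, segment_len, seed=None):
    """Shuffle the order of the segments, not the samples inside them."""
    segments = make_segments(n_samples, segment_len)
    rng = random.Random(seed)  # local RNG so the global state is untouched
    rng.shuffle(segments)
    return segments


# Example: 10 time steps grouped into segments of 4 consecutive steps.
for segment in shuffled_segment_order(10, 4, seed=0):
    print(segment)
```

If this fits your use case, you could flatten the shuffled segments into one index list, wrap your training set with `torch.utils.data.Subset(dataset, indices)`, and keep shuffle=False in the DataLoader. If you carry hidden states across batches, you would probably also want to reset them at each segment boundary.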
