Why would one shuffle the test data?

I’m aware that uniformly shuffling the training data yields well-mixed batches, and hence helps training.

However, is there really a point in shuffling test data?

Shuffling data prior to the Train/Val/Test split makes each split representative of the overall distribution, which reduces the variance between the train and test sets.
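A minimal sketch of why this matters, using toy labels that happen to be sorted by class (the data, label values, and split fraction here are made up for illustration):

```python
import random

# Toy labels that are ordered by class -- common in real datasets
# (e.g. files loaded one class directory at a time).
labels = [0] * 50 + [1] * 50

def split(data, test_frac=0.2):
    """Naive tail split: last test_frac of the data becomes the test set."""
    cut = int(len(data) * (1 - test_frac))
    return data[:cut], data[cut:]

# Without shuffling, the test set contains only class 1.
train, test = split(labels)
print(sum(test) / len(test))  # 1.0 -- every test example is class 1

# Shuffle first, and the test set roughly matches the overall class balance.
random.seed(0)
shuffled = labels[:]
random.shuffle(shuffled)
train_s, test_s = split(shuffled)
print(sum(test_s) / len(test_s))  # close to the true 0.5 class balance
```

The unshuffled split gives a test set drawn from a different distribution than the train set, which is exactly the variance the shuffle-before-split step avoids.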

Other than that, there is no point (that I’m aware of) in shuffling the test set, since the weights are not being updated between batches.

Do you have a specific use case when you encountered shuffled test data?


Your test/validation data is meant to measure how good your model is. Hence, whether to shuffle your test/validation set depends on what you are choosing to measure.
For example, if you are extracting information that is batch-dependent (e.g., the mean loss per batch) or you have a component in your neural network that is batch-dependent (e.g., batch normalization running in training mode), then you should shuffle for consistency.
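A small sketch of the batch-dependent-metric case: the overall mean loss is order-independent, but per-batch means are not. The loss values and batch size below are invented for illustration:

```python
import random

# Fixed per-example test losses; the examples happen to be
# sorted by difficulty (easy first, hard last).
losses = [0.1] * 4 + [0.9] * 4

def batch_means(values, batch_size=4):
    """Mean loss within each consecutive batch."""
    return [sum(values[i:i + batch_size]) / batch_size
            for i in range(0, len(values), batch_size)]

# The overall mean loss does not depend on the example order...
print(sum(losses) / len(losses))  # 0.5

# ...but the per-batch means do: unshuffled, one batch looks
# "easy" and the other "hard".
print(batch_means(losses))  # [0.1, 0.9]

# After shuffling, easy and hard examples mix across batches,
# so the per-batch statistics change even though the data did not.
random.seed(1)
shuffled = losses[:]
random.shuffle(shuffled)
print(batch_means(shuffled))
```

So if whatever you log or feed forward is computed per batch, shuffling (or deliberately not shuffling) the test set changes those numbers, even though the model and the aggregate loss stay the same.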