DataLoader parameter "shuffle" affects model accuracy

Hello ,
The parameter shuffle in DataLoader class seems to affect the model in some way.
I have a saved model for a binary classification task (cats vs dogs) and changing the parameter shuffle in DataLoader affects my model heavily.
I have used torch.save method to save my trained model , and I used torch.load method to load it in another python file.
When shuffle is set to False in DataLoader , the model gives around 52% accuracy but the saved model had about 98% accuracy during validation tests.
The model only performs how it is supposed to perform is when shuffle is set to True in the prediction python file I use.
The validation set used for testing accuracy of the model while changing shuffle parameter is the same.
Below is a snippet of code which calculates accuracy of the model.
I would also like to note that the model is a pretrained model called resnet18 , i only trained the fc layer in order to fit it to my task.

test_loader = DataLoader(test_set , batch_size=64 , shuffle=False)

total_correct = 0
total_seen = 0
for xb,yb in test_loader:
    xb=xb.cuda()
    yb=yb.cuda()
    preds = model(xb.float())
    total_correct += ((torch.sum(preds.round() == yb.reshape(-1,1))).item())
    total_seen += yb.numel()
print(total_correct / total_seen)

When shuffle is set to False , like the above snippet , the accuracy i get is 0.5284231339594662
When shuffle is set to True , the accuracy i get is 0.957983193277311
This change is only by changing shuffle to True.
I don’t understand why or how DataLoader affects my pretrained model’s accuracy , please help me out.

Also , somehow using model.eval() solves the above problem.

This would be expected, since calling model.eval() would disable dropout layers (shouldn’t make a difference regarding shuffling the dataset) and would use the running stats of all batchnorm layers.
If you leave the model in training mode, the batchnorm layers would use the current batch stats to normalize the inputs and would also update the running stats. Shuffling the dataset is thus making a difference, which is why you should call model.eval() during validation and testing.

1 Like