Hello,
The shuffle parameter of the DataLoader class seems to affect my model's predictions.
I have a saved model for a binary classification task (cats vs. dogs), and changing shuffle in the DataLoader changes its accuracy dramatically.
I saved the trained model with torch.save and loaded it in another Python file with torch.load.
When shuffle is set to False in the DataLoader, the model gives around 52% accuracy, even though the saved model had about 98% accuracy during validation.
The model only performs as expected when shuffle is set to True in the prediction script.
The validation set used to measure accuracy is the same in both cases; only the shuffle parameter changes.
I should also note that the model is a pretrained resnet18; I trained only the fc layer to fit it to my task.
Below is the snippet that calculates the accuracy of the model.
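import torch
from torch.utils.data import DataLoader

# model and test_set are created earlier in the prediction script
test_loader = DataLoader(test_set, batch_size=64, shuffle=False)
total_correct = 0
total_seen = 0
for xb, yb in test_loader:
    xb = xb.cuda()
    yb = yb.cuda()
    preds = model(xb.float())
    # round the outputs to 0/1 and compare them with the labels
    total_correct += torch.sum(preds.round() == yb.reshape(-1, 1)).item()
    total_seen += yb.numel()
print(total_correct / total_seen)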
When shuffle is set to False, as in the snippet above, the accuracy I get is 0.5284231339594662.
When shuffle is set to True, the accuracy I get is 0.957983193277311.
The only difference between the two runs is the value of shuffle.
I don't understand why or how the DataLoader affects my pretrained model's accuracy. Please help me out.
Also, somehow calling model.eval() solves the above problem.
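One possible explanation for why model.eval() helps, offered as a guess and not a confirmed diagnosis: resnet18 contains BatchNorm layers, and in training mode those layers normalize each batch with that batch's own mean and variance. If the validation set happens to be stored class by class (an assumption, the post does not say so), then with shuffle=False every batch holds a single class, and the per-batch normalization can wipe out the difference between classes. In eval mode the layers use the running statistics collected during training, so batch composition no longer matters. The sketch below reproduces the effect with a lone BatchNorm1d layer and synthetic data; the names bn, cats, dogs, and mean_output are hypothetical and not taken from the original script.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-in for the BatchNorm layers inside resnet18.
bn = nn.BatchNorm1d(4)

# Synthetic features for two classes that differ only in their mean.
cats = torch.randn(64, 4) + 3.0
dogs = torch.randn(64, 4) - 3.0
mixed = torch.cat([cats, dogs])

# "Training": feed shuffled, mixed batches so the running statistics
# describe data that contains both classes.
bn.train()
with torch.no_grad():
    for _ in range(100):
        idx = torch.randperm(mixed.shape[0])[:64]
        bn(mixed[idx])


def mean_output(module, batch):
    """Average activation the module produces for one batch."""
    with torch.no_grad():
        return module(batch).mean().item()


# Eval mode: running statistics are used, so the class offset survives.
bn.eval()
eval_cats = mean_output(bn, cats)
eval_dogs = mean_output(bn, dogs)

# Training mode with single-class batches (what shuffle=False would give
# on a class-sorted dataset): each batch is re-centred to zero mean,
# so both classes look the same to the next layer.
bn.train()
train_cats = mean_output(bn, cats)
train_dogs = mean_output(bn, dogs)

print("eval :", eval_cats, eval_dogs)
print("train:", train_cats, train_dogs)
```

If this is what is happening, calling model.eval() once after torch.load and before the accuracy loop (and wrapping the loop in torch.no_grad()) would be the usual pattern, which matches the observation that model.eval() makes the problem go away.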