Prediction accuracy of a trained model is much worse during testing than during validation, even though both run on the same validation set

@ptrblck Thank you so much for replying!
I’m not sure what you mean by “the usage of model.eval() is different”. For both my validate and predict functions (the latter being my testing), I follow the same order:

model.to(device)
model.eval()                       # eval mode for BatchNorm/Dropout layers

with torch.no_grad():              # disable gradient tracking for inference
    for data in dataloader:
        images, labels = data      # unpack the batch
        images = images.to(device)
        outputs = model(images)
        ...

Should I do it differently for the predict function?
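
To make sure we are comparing the same thing, here is a minimal, self-contained sketch of the single evaluation routine I think should serve both validation and testing (evaluate, dataloader, and the (images, labels) batch format are placeholder names for illustration, not my exact code):

import torch

def evaluate(model, dataloader, device):
    # One shared routine for validation and testing so the two paths
    # cannot silently diverge.
    model.to(device)
    model.eval()                          # eval mode for BatchNorm/Dropout
    correct, total = 0, 0
    with torch.no_grad():                 # no gradient tracking for inference
        for images, labels in dataloader:
            images = images.to(device)
            labels = labels.to(device)
            outputs = model(images)
            preds = outputs.argmax(dim=1)  # predicted class per sample
            correct += (preds == labels).sum().item()
            total += labels.size(0)
    return correct / total

Calling this once with the validation loader and once in my predict function on the same validation set should give matching accuracies, so if they don’t, the difference must come from somewhere outside this loop (e.g. data preprocessing or checkpoint loading).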

I can understand the testing result being a bit lower than the validation result, but dropping from 87% to 1x% is too dramatic, right? That shouldn’t be something that can be fixed by simply tuning the momentum, should it?
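
(For context on the momentum remark: I assume you mean the momentum argument of BatchNorm, which controls how the running statistics used in eval mode are updated during training. A toy illustration of how much train and eval mode can disagree when those statistics are off, not my actual model:

import torch
import torch.nn as nn

bn = nn.BatchNorm1d(4, momentum=0.1)
x = torch.randn(8, 4) * 5 + 3          # batch stats far from the initial running stats

bn.train()
out_train = bn(x)                      # normalized with the batch's own statistics
bn.eval()
out_eval = bn(x)                       # normalized with the stored running statistics
print(torch.allclose(out_train, out_eval))   # False: the two modes disagree

So stale running statistics could hurt eval accuracy, but I wouldn’t expect momentum alone to account for a drop that large.)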

I saw your comments regarding the use of model.eval() in another post, but I’m not sure they apply to my case; maybe I don’t fully understand that post. Sorry if I’m making you repeat yourself.