Slump in accuracy during prediction


I have a PyTorch neural network and two datasets: train and dev. During training, I validate the model on the dev dataset after every epoch, and I save the model with the best F1 accuracy so far (as measured during that per-epoch validation). The best model achieves around 66% accuracy.

After training is complete, I run prediction on the dev dataset using the best model saved during training. Here, the accuracy slumps to 61%. I cannot understand why it drops, since the dataset used for validation (during training) and for prediction is the same.

Please let me know if you have any idea what I might have done wrong. I can share more details if needed.


Hello there,

Do you have any components in your model that behave differently when the model is put into train() or eval() mode? Do you use any custom code to save and load weights? Do you use different settings when you reload the weights? Also, do you have a link to the repo?
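
In case it helps, here is a minimal sketch of the kind of check I have in mind. It is not based on your code: the toy model, the `make_model` helper, and the in-memory buffer are all made up for illustration. It assumes the weights are saved with `state_dict()` and that the model contains layers such as Dropout or BatchNorm, which behave differently in train() and eval() mode.

```python
import io

import torch
import torch.nn as nn

torch.manual_seed(0)


def make_model():
    # Hypothetical toy architecture. BatchNorm1d and Dropout both change
    # behaviour depending on train()/eval() mode.
    return nn.Sequential(
        nn.Linear(8, 16),
        nn.BatchNorm1d(16),
        nn.ReLU(),
        nn.Dropout(p=0.5),
        nn.Linear(16, 2),
    )


model = make_model()
x = torch.randn(32, 8)

# Save only the weights. An in-memory buffer stands in for a checkpoint file.
buffer = io.BytesIO()
torch.save(model.state_dict(), buffer)
buffer.seek(0)

# Rebuild the same architecture and load the saved weights into it.
reloaded = make_model()
reloaded.load_state_dict(torch.load(buffer))

# With both models in eval() mode the outputs should match.
model.eval()
reloaded.eval()
with torch.no_grad():
    out_original = model(x)
    out_reloaded = reloaded(x)
same_in_eval = torch.allclose(out_original, out_reloaded)

# If the reloaded model is left in train() mode, Dropout is active and
# BatchNorm uses batch statistics, so the outputs no longer match.
reloaded.train()
with torch.no_grad():
    out_train_mode = reloaded(x)
differs_in_train = not torch.allclose(out_original, out_train_mode)

print("same output in eval mode:", same_in_eval)
print("different output in train mode:", differs_in_train)
```

If a check like this passes on your real model but the scores still differ, the gap is more likely in the data pipeline or the metric code than in the saved weights.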