Thanks for your question! The behavior of model.eval() with batchnorm layers has historically caused some confusion; see, for example, this thread: Model.eval() gives incorrect loss for model with batchnorm layers (post #11 by meetshah1995).
Could you try the advice in that thread and see whether it improves the eval accuracy? Also, what accuracy do you get if you skip the model.eval() call and leave the model in training mode?
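
In case it helps to see why the two modes can disagree: in eval mode a batchnorm layer normalizes with its running estimates of the mean and variance rather than with the current batch's statistics, and those estimates are an exponential moving average that starts at mean 0 and variance 1. Below is a rough plain-Python sketch of that update, not your model's actual code. The helper names (`simulate_running_stats`, `make_batches`) and the activation distribution (mean 5.0, std 2.0) are made up for illustration; the momentum of 0.1 matches PyTorch's default. This may or may not be what is happening in your case.

```python
import random
import statistics


def simulate_running_stats(batches, momentum=0.1):
    """Track running mean/variance the way a batchnorm layer does in training.

    Starts from mean 0 and variance 1, then blends in each batch's
    statistics with the given momentum (hypothetical helper).
    """
    running_mean, running_var = 0.0, 1.0
    for batch in batches:
        batch_mean = statistics.fmean(batch)
        batch_var = statistics.variance(batch)  # unbiased sample variance
        running_mean = (1 - momentum) * running_mean + momentum * batch_mean
        running_var = (1 - momentum) * running_var + momentum * batch_var
    return running_mean, running_var


def make_batches(n_batches, batch_size=32):
    """Hypothetical activations drawn from a Gaussian with mean 5.0, std 2.0."""
    return [
        [random.gauss(5.0, 2.0) for _ in range(batch_size)]
        for _ in range(n_batches)
    ]


random.seed(0)

# After only a few training steps the running estimates are still close
# to their initial values, so eval-mode normalization is badly off.
few = simulate_running_stats(make_batches(3))

# After many steps they approach the true mean (5.0) and variance (4.0).
many = simulate_running_stats(make_batches(200))

print("after 3 batches:   mean=%.3f var=%.3f" % few)
print("after 200 batches: mean=%.3f var=%.3f" % many)
```

If the running estimates in your model look like the first case (few training steps, or batches that are very small or not representative), eval accuracy can drop even though training accuracy looks fine.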