Model.eval() gives incorrect loss for model with batchnorm layers

smth · September 20, 2017, 5:00am

it is possible that your training in general is unstable, so BatchNorm’s running_mean and running_var dont represent true batch statistics.

Try the following:

change the momentum term in BatchNorm constructor to higher.
before you set model.eval(), run a few inputs through model (just forward pass, you dont need to backward). This will help stabilize the running_mean / running_std values.

Hope this helps.