Performance highly degraded when eval() is activated in the test phase

@Valerio_Biscione Thank you!! This fixed my low accuracy in eval mode when using a smaller batch size.
In my case I didn't have direct access to the model class, so I couldn't construct the batch norm layers with track_running_stats=False. As you rightly mentioned, the latest commit checks the batch norm's running statistics to decide whether to behave as training or eval mode, so I set the running mean and var buffers in the batch norm layers to None and it worked out perfectly.

import torch.nn as nn

# model.modules() already yields every submodule recursively,
# so a nested loop over children is unnecessary and would visit
# the same layers multiple times.
for m in model.modules():
    if isinstance(m, nn.BatchNorm2d):
        # Dropping the running-stat buffers forces the layer to use
        # batch statistics even after model.eval().
        m.track_running_stats = False
        m.running_mean = None
        m.running_var = None

model.eval()

I have encountered the same problem.
Simply put, the model seems to train well and the loss is as expected during training.
At test time, after calling model.eval(), the results look bad and the loss is high.
Using model.train(), or setting m.track_running_stats = False, really improves the results; however, if I evaluate the model with batch_size=1, the results are bad again.
Then I checked my code and found that the batch norm layers' affine was set to False, which I think is probably the cause. I am now retraining my model with affine=True and will report the result.
BTW, if you encounter the same problem and your batch norm's affine is False, that may be the reason; see the sketch below.
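
Not the poster's actual model, but a minimal sketch of the configuration being discussed, assuming a toy conv block: affine=True keeps the learnable scale/shift parameters, and track_running_stats=False makes BatchNorm normalize with batch statistics even in eval mode.

import torch
import torch.nn as nn

# Illustrative block only; the layer sizes are made up.
block = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.BatchNorm2d(16, affine=True, track_running_stats=False),
    nn.ReLU(),
)

block.eval()
x = torch.randn(8, 3, 32, 32)
out = block(x)  # normalized with this batch's statistics
# With batch_size=1, the statistics come from a single image,
# which matches the degraded results described above.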

I have the same problem.
I tried to overfit on a single image, training on it and validating on it at the same time. The training output is very good, but the validation output is bad. Then I set:
cudnn.deterministic = True
It worked. I speculate that the cuDNN benchmark algorithm causes each forward pass to produce slightly different results.
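
For reference, a minimal sketch of the standard torch.backends.cudnn switches involved (that the benchmark autotuner is the cause here is the poster's speculation, not a confirmed diagnosis):

import torch

# Ask cuDNN to select deterministic convolution algorithms.
torch.backends.cudnn.deterministic = True
# Disable the autotuner, which benchmarks several algorithms per
# input shape and may pick different kernels between runs.
torch.backends.cudnn.benchmark = False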


Thanks for your answer! This works for me! track_running_stats has to be False when creating the layer, i.e. BatchNorm(…, track_running_stats=False).
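
A small sketch of why creation time matters (this relies on the buffer check mentioned earlier in the thread: BatchNorm's forward keys off whether the running-stat buffers exist, not off the flag alone):

import torch.nn as nn

# Created with the flag: the running-stat buffers are never registered,
# so eval mode falls back to batch statistics automatically.
bn_fresh = nn.BatchNorm2d(16, track_running_stats=False)
print(bn_fresh.running_mean)  # None

# Flipping the flag afterwards leaves the buffers in place, so eval
# mode would still normalize with the stale running statistics unless
# the buffers are also set to None, as in the workaround above.
bn_stale = nn.BatchNorm2d(16)
bn_stale.track_running_stats = False
print(bn_stale.running_mean)  # still a tensor, not None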

This resolves my issue. I just applied it at test time and it performs well. Thank you.