Output varies when changing batch size (during test)

I’m using a resnet18 network from torchvision.models.

During test time, I observe that if I alter the batch size of the data loader, the accuracy of the model on the test data changes. I cannot understand why this should happen. Isn't it the case that the weights of the network are fixed during testing (I haven't called optimizer.step())? I also went through ResNet's architecture, and there is no randomized output at any layer.

Any pointers to where I might be wrong in my code or my understanding?

I think it's because of the BatchNorm layers. The effect was more visible with smaller batch sizes: accuracy dropped from 99.7% with a batch size of 25 to 14.85% with a batch size of 1.
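For example, this tiny sketch (randomly initialized resnet18 and a random input, not my actual test code) shows that a model left in training mode gives different outputs for the same image depending on which batch it is part of, because BatchNorm normalizes with the statistics of the current batch:

    import torch
    import torchvision.models as models

    model = models.resnet18()            # still in train() mode by default
    x = torch.randn(25, 3, 224, 224)     # dummy batch

    with torch.no_grad():
        out_in_batch = model(x)[0]       # first image, normalized with batch-of-25 stats
        out_alone = model(x[:1])[0]      # same image, normalized with batch-of-1 stats

    print(torch.allclose(out_in_batch, out_alone))  # False: output depends on the batch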

Did you set the model to evaluation mode with model.eval()? After that, the batch size should not change the predictions!
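Something along these lines (test_loader is just a placeholder for your own DataLoader):

    model.eval()                      # BatchNorm/Dropout switch to inference behaviour
    with torch.no_grad():             # no gradients needed at test time
        for images, labels in test_loader:
            outputs = model(images)
            preds = outputs.argmax(dim=1)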


Thanks! After model.eval() is called, the BatchNorm layers use the running mean and variance for normalization. I had never read this part of the documentation.
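For anyone else hitting this, here is a quick check (again with a randomly initialized resnet18 and random input, just to illustrate): after model.eval(), the same image gives the same output no matter which batch it is in.

    import torch
    import torchvision.models as models

    model = models.resnet18()
    model.eval()                         # BatchNorm now uses running_mean / running_var

    x = torch.randn(25, 3, 224, 224)
    with torch.no_grad():
        out_in_batch = model(x)[0]       # first image inside a batch of 25
        out_alone = model(x[:1])[0]      # same image in a batch of 1

    print(torch.allclose(out_in_batch, out_alone, atol=1e-5))  # True: batch size no longer matters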

Hey! I got the same issue in my code. The results are very bad with a batch size of 1, which is not practical when evaluating a single image.

I call model.eval() as well as the following:

    import torch.nn as nn

    # modules() recurses into submodules; children() only yields top-level ones
    for child in model.modules():
        if isinstance(child, nn.BatchNorm2d):
            child.track_running_stats = False

EDIT: does eval() make the model use the running stats without updating the mean and variance? Are the mean and variance supposed to scale with the batch size?

Any ideas? The network is a stock ResNet adapted for regression instead of classification.
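For reference, a small check I ran regarding my edit (bn1 is the first BatchNorm layer of the stock torchvision ResNet; a sketch, not my full code):

    import torch
    import torchvision.models as models

    model = models.resnet18()
    bn = model.bn1
    before = bn.running_mean.clone()
    x = torch.randn(4, 3, 224, 224)

    model.eval()
    with torch.no_grad():
        model(x)
    print(torch.equal(bn.running_mean, before))   # True: eval() only reads the running stats

    model.train()
    with torch.no_grad():
        model(x)
    print(torch.equal(bn.running_mean, before))   # False: train() updates them from the batch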

This one-liner basically saved my day. Thanks a lot!