Batch size and shuffle affecting evaluation

Would the batch size or batch order affect the behavior of BatchNorm or any other layer in eval mode?

I have a model trained with batch size 16, and when I evaluate at batch size 16, I get the expected results. When I change the batch size for evaluation, the results get worse as the batch size decreases.

Likewise, shuffling the DataLoader during evaluation improves results. I assume it's because shuffled batches are more balanced than the fixed file order, but I don't know why that would affect results in eval mode.

I think this is an error in my code, but I'd like to ask whether I'm misunderstanding the intended behavior before I start debugging. I'm using Google Colab and the same network as the one here, if that's relevant.

No, the batch size should not have any effect on BatchNorm layers during eval(), besides small numerical errors caused by limited floating-point precision and a potentially different order of operations. In eval() mode, BatchNorm normalizes each sample with its stored running statistics rather than the current batch statistics, so the other samples in the batch cannot influence the output.
Your model also works for me and doesn’t show any difference:

import torch

# UNet is the model class from the code linked in the question
model = UNet(3, 10)
model.eval()

x = torch.randn(10, 3, 224, 224)
# create the reference output with the full batch
out = model(x)

# run each sample separately and compare against the reference
for idx, x_ in enumerate(x):
    x_ = x_.unsqueeze(0)
    out_single = model(x_)
    print((out_single.squeeze(0) - out[idx]).abs().max())

# tensor(0., grad_fn=<MaxBackward1>)
# tensor(0., grad_fn=<MaxBackward1>)
# tensor(0., grad_fn=<MaxBackward1>)
# tensor(0., grad_fn=<MaxBackward1>)
# tensor(0., grad_fn=<MaxBackward1>)
# tensor(0., grad_fn=<MaxBackward1>)
# tensor(0., grad_fn=<MaxBackward1>)
# tensor(0., grad_fn=<MaxBackward1>)
# tensor(0., grad_fn=<MaxBackward1>)
# tensor(0., grad_fn=<MaxBackward1>)
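
For contrast, here is a minimal sketch (using a standalone nn.BatchNorm2d rather than your model) of how the per-sample output only becomes batch-dependent in train() mode, where the current batch statistics are used for normalization:

import torch
import torch.nn as nn

bn = nn.BatchNorm2d(3)
x = torch.randn(10, 3, 8, 8)

# eval(): each sample is normalized with the stored running statistics,
# so the output for x[0] does not depend on the rest of the batch
bn.eval()
print((bn(x)[0] - bn(x[:1])[0]).abs().max())  # ~0

# train(): normalization uses the current batch statistics,
# so the output for x[0] changes with the batch composition
bn.train()
print((bn(x)[0] - bn(x[:1])[0]).abs().max())  # clearly non-zero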

Thanks for the help!

Your mention of precision errors was very helpful. I found that the code I copied was using a long tensor to store the mask values, so the precision errors were quite high.
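
In case others run into the same thing, here is a minimal, hypothetical sketch (not the actual code from the thread) of how long values can lose precision once they pass through a float32 tensor. float32 has a 24-bit mantissa, so integers above 2**24 are no longer represented exactly:

import torch

mask = torch.tensor([2**24 + 1], dtype=torch.long)
# casting to float32 and back silently drops the +1
print(mask.float().long())  # tensor([16777216])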