Dependence of the model output on the batch size

ushakovegor · February 7, 2022, 3:54pm

Hello,
During inference I noticed that the output heatmap of my model depends on the batch size of the dataloader.
Here are the images:
images
The left image is with batch_size=1, the right with batch_size=5 (the batch size I used for training).

When I use the native batch size (5), the quality of the heatmap is better.
Here is my model: https://www.dropbox.com/s/ms3mvh030t7n74a/latest3.pth

I believe the model output should not depend on the batch size, but it does. Can you give me any suggestions about this? Is this normal behavior for a model?
If needed, I can attach the code for my dataset and the inference code for my model.

anantguptadbl · February 7, 2022, 7:35pm

@ushakovegor You are right that the model should not depend on the batch size, but when you are updating gradients, anything can happen.

If your dataset is skewed, then the model will oscillate during each batch execution.
A batch size of 1 does not make sense, as there will be very little room for the model to play around.
If you use very large batch sizes, then you will overfit.
Most people use a batch size between 16 (low) and 64 (high).

ushakovegor · February 8, 2022, 10:48am

Hello,
Thanks for your response. The problem with this model's behavior was related to BatchNorm: I forgot to switch the model to eval mode. Switching to eval mode resolved the problem.
This problem was also discussed here:
https://discuss.pytorch.org/t/output-varies-when-changing-batch-size-during-test/10439
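
For anyone who finds this later, here is a minimal sketch of the effect. The small network below is a hypothetical stand-in (my actual model is not shown in this thread), and the tensor shapes and the helper name `first_sample_outputs` are made up for illustration. In train mode, BatchNorm normalizes with the statistics of the current batch, so the output for one sample changes with whatever else is in the batch. In eval mode it uses the stored running statistics, so the output should no longer depend on the batch size.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in network containing a BatchNorm layer.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
    nn.Conv2d(8, 1, kernel_size=1),
)

# Five random "images"; the shapes are arbitrary assumptions.
batch = torch.randn(5, 3, 16, 16)


def first_sample_outputs(net, x):
    """Return the output for x[0] computed alone and as part of the full batch."""
    with torch.no_grad():
        alone = net(x[:1])      # batch size 1
        in_batch = net(x)[:1]   # batch size 5, keep only the first sample
    return alone, in_batch


# Train mode: BatchNorm uses the statistics of the current batch.
model.train()
alone, in_batch = first_sample_outputs(model, batch)
train_mode_matches = torch.allclose(alone, in_batch, atol=1e-5)

# Eval mode: BatchNorm uses its running statistics instead.
model.eval()
alone, in_batch = first_sample_outputs(model, batch)
eval_mode_matches = torch.allclose(alone, in_batch, atol=1e-5)

print("train mode, outputs match:", train_mode_matches)
print("eval mode, outputs match:", eval_mode_matches)
```

So for inference, calling `model.eval()` once after loading the weights (and wrapping the forward pass in `torch.no_grad()`) is the usual pattern. Note that `model.eval()` also changes the behavior of other mode-dependent layers such as Dropout.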