Dependence of the model output on the batch size

At inference time I noticed that the output heatmap of my model depends on the batch size of the dataloader.
Here are the images:
The left image is with batch_size=1, the right with batch_size=5 (the batch size I used for training).

When I use the native batch size (5), the quality of the heatmap is better.
Here is my model:

I believe the model should not depend on the batch size, but it does. Can you give me any suggestions about this? Is this normal behavior for the model?
If needed, I can attach the code of my dataset and my inference code.

@ushakovegor you are right that the model should not depend on the batch size, but when you are updating gradients, anything can happen:

  • If your dataset is skewed, the model will oscillate from batch to batch

  • A batch size of 1 does not make much sense, as there is very little room for the model to average over

  • If you use very large batch sizes, you may overfit

  • Most people use a batch_size between 16 on the low end and 64 on the high end
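Beyond gradient updates, BatchNorm layers are a common cause of this symptom: in training mode they normalize with the statistics of the *current* batch, so the output for the same image changes with batch size. A minimal sketch (a bare `nn.BatchNorm2d`, not the poster's actual model) illustrating the difference:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm2d(3)  # normalizes per channel

x = torch.randn(5, 3, 8, 8)   # a batch of 5 "images"
single = x[:1]                # the same first image, alone in a batch of 1

bn.train()                    # training mode: uses statistics of the current batch
out_b5 = bn(x)[:1]            # first image, normalized with batch-of-5 stats
out_b1 = bn(single)           # first image, normalized with batch-of-1 stats
print(torch.allclose(out_b5, out_b1))   # False: output depends on batch size

bn.eval()                     # eval mode: uses fixed running statistics
out_b5 = bn(x)[:1]
out_b1 = bn(single)
print(torch.allclose(out_b5, out_b1))   # True: output is batch-size independent
```

In eval mode the layer switches to its stored running mean and variance, so the same input always maps to the same output regardless of what else is in the batch.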

Thanks for your response. The problem with this model behavior was related to BatchNorm: I forgot to switch the model to eval mode. Switching it resolved the problem.
This problem was also discussed here:
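For reference, the usual PyTorch inference pattern looks like the sketch below (the `nn.Sequential` here is a hypothetical stand-in; any module containing BatchNorm or Dropout behaves the same way):

```python
import torch
import torch.nn as nn

# hypothetical stand-in for the heatmap model
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
)

model.eval()                  # switch BatchNorm/Dropout to inference behavior
with torch.no_grad():         # no gradients needed at inference time
    batch = torch.randn(1, 3, 32, 32)
    heatmap = model(batch)    # output no longer depends on the batch size
```

With `model.eval()` set, running a sample alone or inside a larger batch gives the same per-sample output.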