Difference in accuracy when setting model.eval() and not setting it

Hi all,
I am working on a semantic segmentation problem. I trained a model with a UNet-like architecture and saved its weights. Now, when I load the saved model and generate segmentations (i.e., run inference), I see an unusual issue. When I call model.eval() before the inference loop, my Dice score is about 0.4; when I don’t call it (I believe the model then stays in its default training mode, as if model.train() had been called), the Dice score is about 0.6. This has perplexed me. I know that model.eval() makes BatchNorm layers use their running statistics instead of the current batch statistics and disables Dropout, but I did not expect such a large difference in accuracy. Can someone help me with this?

Did you see the same effect during training when comparing the training and validation losses (which correspond to model.train() and model.eval(), respectively), or do you only see this issue after loading the trained model?
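
One way to narrow this down after loading the model is to run the same batch through it in both modes and inspect the BatchNorm running statistics. The snippet below is only a minimal sketch, not your actual setup: the small nn.Sequential model and the random input are hypothetical stand-ins for your UNet and a real batch, and compare_modes and batchnorm_stats are helper names made up for illustration.

```python
import copy

import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the trained UNet: any model containing
# BatchNorm and Dropout layers shows the same train/eval behaviour.
model = nn.Sequential(
    nn.Conv2d(1, 4, kernel_size=3, padding=1),
    nn.BatchNorm2d(4),
    nn.ReLU(),
    nn.Dropout2d(p=0.5),
    nn.Conv2d(4, 1, kernel_size=1),
)

# Hypothetical input batch whose statistics are far from the
# initial BatchNorm running statistics (mean 0, variance 1).
x = torch.randn(8, 1, 16, 16) * 3.0 + 5.0


def compare_modes(model, x):
    """Return (train-mode output, eval-mode output) for the same batch."""
    # Work on a copy: a forward pass in train mode updates the BatchNorm
    # running statistics even under torch.no_grad().
    probe = copy.deepcopy(model)
    with torch.no_grad():
        probe.eval()
        out_eval = probe(x)
        probe.train()
        out_train = probe(x)
    return out_train, out_eval


def batchnorm_stats(model):
    """Collect running_mean, running_var and num_batches_tracked per BatchNorm layer."""
    stats = {}
    for name, module in model.named_modules():
        if isinstance(module, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            stats[name] = (
                module.running_mean.clone(),
                module.running_var.clone(),
                int(module.num_batches_tracked),
            )
    return stats


print("default mode is training:", model.training)

out_train, out_eval = compare_modes(model, x)
print("max abs difference between modes:", (out_train - out_eval).abs().max().item())

for name, (mean, var, n_batches) in batchnorm_stats(model).items():
    print(name, "running_mean:", mean, "running_var:", var, "batches tracked:", n_batches)
```

If the running_mean and running_var of your loaded model still look like their initial values (zeros and ones), or num_batches_tracked is 0, the saved state may not have been restored as expected. If they look plausible but the eval-mode output is still much worse, the running estimates may simply not match your inference data, for example because of a very small training batch size or different preprocessing at inference time.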