Output values are different in eval() mode

yongjun_Hong · February 7, 2023, 3:16am

Hello.

When I run the following, o1 and o3 are different. (Shouldn't they be the same?)

The model contains only BatchNorm2d layers.
If the model is not updated in train mode (as below, with no optimizer step), I think o1 and o3 should be the same.
o2 can differ from o1/o3 because of BatchNorm2d.
When can o1 and o3 be different? (I think I did something wrong, but I still can't figure it out.)

Thanks!

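import torch

x = torch.randn(1, 3, 224, 224)

# Model is my own module; it contains only BatchNorm2d layers
model = Model()
model.eval()

# eval mode: normalizes with the running stats
o1 = model(x)

# train mode: normalizes with the stats of the current batch
model.train()
o2 = model(x)

# eval mode again
model.eval()
o3 = model(x)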

ptrblck · February 7, 2023, 3:47am

The outputs are computed as follows:

o1 will be computed using the running stats from batchnorm layers to normalize the corresponding input activations.
o2 will be computed using the batch statistics of the current input activations to normalize them. The running stats of the batchnorm layers will also be updated during this forward pass.
o3 will use the same approach as o1, but with the updated running stats from the previous forward pass.
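
As a rough illustration of the three steps above, here is a minimal sketch. The `Model` class is a hypothetical stand-in (a single `BatchNorm2d` layer), since the original model was not posted, and the input is shifted and scaled (an assumption, not the original `torch.randn` input) so the effect is easy to see.

```python
import torch
import torch.nn as nn


class Model(nn.Module):
    """Hypothetical stand-in for the poster's model: one BatchNorm2d layer."""

    def __init__(self):
        super().__init__()
        self.bn = nn.BatchNorm2d(3)

    def forward(self, x):
        return self.bn(x)


torch.manual_seed(0)
# Assumption: shift/scale the input so the batch stats clearly differ
# from the initial running stats (mean 0, var 1).
x = 3.0 + 2.0 * torch.randn(1, 3, 224, 224)
model = Model()

model.eval()
with torch.no_grad():
    o1 = model(x)  # normalized with the initial running stats
stats_before = model.bn.running_mean.clone()

model.train()
with torch.no_grad():
    o2 = model(x)  # normalized with batch stats; running stats are updated
stats_after = model.bn.running_mean.clone()

model.eval()
with torch.no_grad():
    o3 = model(x)  # normalized with the updated running stats

stats_changed = not torch.equal(stats_before, stats_after)
outputs_match = torch.allclose(o1, o3)
print("running_mean changed:", stats_changed)
print("o1 and o3 match:", outputs_match)
```

Note that no optimizer step and no gradient computation is involved here: the running stats are buffers, and the train-mode forward pass alone changes them.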

Let me know if this clarifies the use case.

yongjun_Hong · February 7, 2023, 4:18am

Yes, I think you are right.
I didn't realize that the batch norm running stats are updated during the forward pass.
So o1 and o3 are indeed different, because the batchnorm statistics were updated during the o2 forward pass.
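
For reference, the update that happens during a train-mode forward pass can be checked by hand. The sketch below assumes the default `BatchNorm2d` settings (`momentum=0.1`, `track_running_stats=True`) and a small made-up input; it recomputes the exponential moving average and compares it with the layer's buffers.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm2d(3)  # default momentum is 0.1
x = torch.randn(4, 3, 8, 8)  # made-up input for illustration

old_mean = bn.running_mean.clone()  # initialized to zeros
old_var = bn.running_var.clone()    # initialized to ones

bn.train()
with torch.no_grad():
    bn(x)  # one train-mode forward pass updates the running stats

# Per-channel batch statistics over the N, H, W dimensions.
batch_mean = x.mean(dim=(0, 2, 3))
# The running variance is updated with the unbiased batch variance.
batch_var = x.var(dim=(0, 2, 3), unbiased=True)

m = bn.momentum
expected_mean = (1 - m) * old_mean + m * batch_mean
expected_var = (1 - m) * old_var + m * batch_var

mean_ok = torch.allclose(bn.running_mean, expected_mean, atol=1e-6)
var_ok = torch.allclose(bn.running_var, expected_var, atol=1e-6)
print("running_mean matches manual update:", mean_ok)
print("running_var matches manual update:", var_ok)
```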

Thanks!