# Batch normalization differs between .eval() and .train() modes even when running and batch statistics are the same

Hey,
I know batch normalization uses different statistics in eval() and train() modes, but when
I make those statistics the same, it still gives me different values.

Here’s the code to reproduce what I am talking about:

```python
import torch
import torch.nn.functional as F
import torch.nn as nn

a = torch.tensor([[[1, 2],
                   [3, 4]]]).float()

b = torch.tensor([[[10, 20],
                   [30, 40]]]).float()

X = torch.stack((a, b)).float()
assert X.shape == (2, 1, 2, 2)

l = nn.BatchNorm2d(1, momentum=1, eps=0).train()  # 1 channel
l(X)  # update running_mean and running_var

def batchnorm(x, u, var):
    return (x - u) / torch.sqrt(var)  # epsilon is 0

one = batchnorm(X, l.running_mean, l.running_var)

# batch norm eval
l.eval()
two = l(X)

# batch norm train
l.train()
three = l(X)

assert (torch.abs(one - two) < 0.0001).all()
assert not (torch.abs(one - three) < 0.0001).all()
```

This code runs without any assertion failing, meaning tensors `one` and `three` are different when I
think they should be equal.

Thank you

Your code returns the expected mismatches:

```
torch.abs(one - two)
Out[15]:
tensor([[[[5.9605e-08, 0.0000e+00],
          [5.9605e-08, 5.9605e-08]]],

        [[[0.0000e+00, 0.0000e+00],

torch.abs(one - three)
Out[16]:
tensor([[[[0.0598, 0.0551],
          [0.0504, 0.0457]]],

        [[[0.0176, 0.0293],
```

The difference is expected: `running_var` is updated with Bessel’s correction (the unbiased variance), while train mode normalizes with the biased batch variance, as seen here.

```python
print(l.running_mean)
# tensor([13.7500])
print(X.mean([0, 2, 3]))
# tensor([13.7500])

print(l.running_var)
# tensor([216.7857])
print(X.var([0, 2, 3], unbiased=False))
# tensor([189.6875])
print(X.var([0, 2, 3], unbiased=False) * X.numel() / (X.numel() - 1))
# tensor([216.7857])
```