# Why doesn't InstanceNorm2d follow the formula?

In a simple test, I applied the formula for InstanceNorm2d by hand and compared it against the module:

```python
import torch
import torch.nn as nn

eps = 1e-5
N = nn.InstanceNorm2d(1, eps=eps)
x = torch.arange(25).view(1, 1, 5, 5).float()
y1 = N(x)
y2 = (x - x.mean()) / torch.sqrt(x.var() + eps)
```

The error is quite big:

```python
>>> y2 - y1
tensor([[[[ 0.0336,  0.0308,  0.0280,  0.0252,  0.0224],
          [ 0.0196,  0.0168,  0.0140,  0.0112,  0.0084],
          [ 0.0056,  0.0028,  0.0000, -0.0028, -0.0056],
          [-0.0084, -0.0112, -0.0140, -0.0168, -0.0196],
          [-0.0224, -0.0252, -0.0280, -0.0308, -0.0336]]]])
```

But the result is consistent: I get the same values on CPU and GPU.
What is the right formula for InstanceNorm2d?

If you're into finding out yourself: Can you spot the pattern in the deviation you see? Did you notice they both have mean 0? Look at `y1/y2`.
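For instance, continuing from the snippet above (so `y1` and `y2` are as defined there), a quick check shows the ratio is a constant:

```python
# The ratio y1 / y2 is constant (NaN only at the center element,
# where both values are exactly 0):
print(y1 / y2)           # ~1.0206 everywhere off-center
print((25 / 24) ** 0.5)  # 1.02062..., i.e. sqrt(n / (n - 1)) with n = 25
```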

It does use the same formula, but the formula does not say which variance estimate to use, and you picked the one that `*Norm` does not use: you have `unbiased=True`, while `*Norm` uses `unbiased=False`. If you fix this, you'll get the same result up to numerical precision.
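A minimal self-contained sketch of the fix (same setup as above, only the variance call changes):

```python
import torch
import torch.nn as nn

eps = 1e-5
N = nn.InstanceNorm2d(1, eps=eps)
x = torch.arange(25).view(1, 1, 5, 5).float()

y1 = N(x)
# use the biased variance (divide by n, not n - 1), as the *Norm layers do
y2 = (x - x.mean()) / torch.sqrt(x.var(unbiased=False) + eps)

print((y2 - y1).abs().max())  # on the order of 1e-7: numerical precision
```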

Best regards

Thomas


I find that running_mean/var's shape is C, not N * C, in torch.nn.InstanceNorm2d, but according to https://arxiv.org/pdf/1607.08022.pdf the shape of the mean/var should be N * C.
And on the page pytorch/Normalization.cpp at master · pytorch/pytorch · GitHub, there is the line
`at::alias(running_mean).copy_(running_mean_.view({ b, c }).mean(0, false));`
I don't understand what the operation `mean(0, false)` is used for, and why.
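For reference, `mean(0, false)` in the ATen C++ API corresponds to `Tensor.mean(dim=0, keepdim=False)` in Python. A small sketch of that reduction (the variable names here are illustrative, not the actual internals):

```python
import torch

b, c = 4, 3
# hypothetical per-instance running means, one value per (sample, channel)
running_mean_ = torch.randn(b * c)

# mean(0, False) == mean over dim 0 with keepdim=False:
# (b, c) -> (c,), averaging the per-instance stats across the batch,
# so the stored buffer ends up with shape C
per_channel = running_mean_.view(b, c).mean(0, False)
print(per_channel.shape)  # torch.Size([3])
```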