Why doesn't InstanceNorm2d follow the formula?

In a simple test, I applied the formula from the InstanceNorm2d documentation by hand and compared it with the module's output:

import torch
import torch.nn as nn

eps = 1e-5
N = nn.InstanceNorm2d(1, eps=eps)
x = torch.arange(25).view(1, 1, 5, 5).float()
y1 = N(x)
y2 = (x - x.mean()) / torch.sqrt(x.var() + eps)

The error is quite large:

y2-y1
tensor([[[[ 0.0336, 0.0308, 0.0280, 0.0252, 0.0224],
[ 0.0196, 0.0168, 0.0140, 0.0112, 0.0084],
[ 0.0056, 0.0028, 0.0000, -0.0028, -0.0056],
[-0.0084, -0.0112, -0.0140, -0.0168, -0.0196],
[-0.0224, -0.0252, -0.0280, -0.0308, -0.0336]]]])

But the result is consistent: I get the same values on CPU and GPU.
What is the right formula for InstanceNorm2d?

If you want to find out yourself: can you spot the pattern in the deviation? Did you notice that both have mean 0? Look at y1/y2.

It does use the same formula, but the formula does not say which variance estimate to use, and you picked the one the *Norm layers don't use: you compute the variance with unbiased=True, while the *Norm layers use unbiased=False. If you fix this, the results agree up to numerical precision.
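
For example, a minimal sketch of the comparison with the biased variance estimate (assuming the same setup as in the original post):

import torch
import torch.nn as nn

eps = 1e-5
N = nn.InstanceNorm2d(1, eps=eps)
x = torch.arange(25).view(1, 1, 5, 5).float()
y1 = N(x)

# InstanceNorm2d normalizes each (sample, channel) slice separately; with
# N = C = 1 that is the whole tensor. Use the biased estimator (unbiased=False),
# as the *Norm layers do.
y2 = (x - x.mean()) / torch.sqrt(x.var(unbiased=False) + eps)

print((y1 - y2).abs().max())  # on the order of 1e-7, i.e. numerical precision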

Best regards

Thomas


I find that running_mean/running_var in torch.nn.InstanceNorm2d have shape C, not N * C, but according to https://arxiv.org/pdf/1607.08022.pdf the mean/var should have shape N * C.
Looking at https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/Normalization.cpp#L581, there is the line
at::alias(running_mean).copy_(running_mean_.view({ b, c }).mean(0, false));
What is the mean(0, false) operation used for, and why?
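
To illustrate what that C++ line computes, here is a rough Python equivalent (a sketch only; b, c and the per-instance statistics below are hypothetical placeholders, not the actual variables from Normalization.cpp):

import torch

b, c = 4, 3  # hypothetical batch size and channel count

# per-instance statistics: one value per (sample, channel) pair, flattened to b*c
running_mean_ = torch.randn(b * c)

# mean(0, False) reduces over dim 0 (the batch dimension) without keeping it,
# collapsing the (b, c) per-instance means into c per-channel means
per_channel_mean = running_mean_.view(b, c).mean(0, False)

print(per_channel_mean.shape)  # torch.Size([3]) -- matches running_mean's shape C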