If you’d like to work it out yourself: can you spot a pattern in the deviation you see? Did you notice that both have mean 0? Look at y1/y2.
It does use the same formula, but the formula doesn’t say which variance estimate to use, and you picked the one *Norm doesn’t: you have unbiased=True, while *Norm uses unbiased=False. If you fix that, you’ll get the same result up to numerical precision.
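As a minimal sketch of the point above (assuming a BatchNorm2d layer in training mode, without affine parameters): normalizing by hand only matches the module’s output when the variance is computed with unbiased=False.

```python
import torch

torch.manual_seed(0)
x = torch.randn(4, 3, 8, 8)

bn = torch.nn.BatchNorm2d(3, affine=False)
bn.train()
y = bn(x)  # normalizes with the biased (population) variance

# Manual normalization per channel over (N, H, W)
mean = x.mean(dim=(0, 2, 3), keepdim=True)
var = x.var(dim=(0, 2, 3), keepdim=True, unbiased=False)  # note: unbiased=False
y_manual = (x - mean) / torch.sqrt(var + bn.eps)

print(torch.allclose(y, y_manual, atol=1e-6))  # matches up to numerical precision
```

With unbiased=True in the `var` call, the two results diverge by a factor of roughly sqrt((n-1)/n), where n is the number of elements reduced over.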
I find that the shape of running_mean/running_var in torch.nn.InstanceNorm is C, not N * C, but according to the paper (https://arxiv.org/pdf/1607.08022.pdf) the shape of the mean/var is N * C.
And from the page pytorch/Normalization.cpp at master · pytorch/pytorch · GitHub, there is this line:

at::alias(running_mean).copy_(running_mean_.view({ b, c }).mean(0, false));

I don’t understand what the mean(0, false) operation is used for here, and why.
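For what it’s worth, the reshape-and-reduce in that C++ line can be sketched in Python (the values here are made up purely for illustration): the per-(sample, channel) statistics of shape (N*C,) are viewed as (N, C) and then averaged over dim 0, the batch dimension, so a single per-channel statistic of shape (C,) is left to store in the module’s running buffer. The `false` argument corresponds to keepdim=False.

```python
import torch

n, c = 4, 3
# hypothetical per-(sample, channel) running means, flattened to shape (N*C,)
running_mean_ = torch.arange(n * c, dtype=torch.float32)

# view({b, c}) then mean(0, false): average over the batch dim, keepdim=False
per_channel = running_mean_.view(n, c).mean(0)  # shape (C,)
print(per_channel)  # tensor([4.5000, 5.5000, 6.5000])
```

This is why the buffer you observe on the module has shape C even though the paper describes N * C statistics: the batch dimension is averaged away before the running estimate is updated.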