Instance norm implemented with basic operations gives a different result than torch.nn.InstanceNorm2d

I implemented instance norm using basic PyTorch operations, but the result differs from torch.nn.InstanceNorm2d. Can anyone help me out? Below is my code:

##########################
import torch
import numpy as np
x = torch.rand((8, 16, 32, 32))
a = torch.nn.InstanceNorm2d(16)  # num_features matches the 16 input channels
a.eval()
with torch.no_grad():
    b = a(x)
x_mean = torch.mean(x, dim=(2, 3), keepdim=True)
x_var = torch.var(x, dim=(2, 3), keepdim=True)
x_norm = (x - x_mean) / torch.sqrt(x_var + 1e-5)
b_numpy = b.numpy()
x_norm_numpy = x_norm.numpy()
# check if b_numpy and x_norm_numpy are close within a tolerance of 1e-3
print(np.allclose(b_numpy, x_norm_numpy, atol=1e-3))
# check if b_numpy and x_norm_numpy are close within a tolerance of 1e-4
print(np.allclose(b_numpy, x_norm_numpy, atol=1e-4))

##########################
result:
True
False
##########################

So the result shows that at a tolerance of 1e-4 they differ, and I don't know why. Can anyone help me get a result closer to torch.nn.InstanceNorm2d? Thanks in advance!

BTW, the reason I do not apply the formula gamma * x_norm + beta from the paper is that when torch.nn.InstanceNorm2d is first initialized, gamma is all ones ([1.0, 1.0, 1.0, …]) and beta is all zeros ([0.0, 0.0, 0.0, …]), so gamma * x_norm + beta = x_norm anyway.
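
For reference, a quick sketch that checks this claim. Note that nn.InstanceNorm2d defaults to affine=False, in which case it has no weight or bias at all; the check below passes affine=True explicitly:

import torch

# With affine=True, a freshly constructed InstanceNorm2d starts with
# gamma (weight) = 1 and beta (bias) = 0, so gamma * x_norm + beta is a no-op.
m = torch.nn.InstanceNorm2d(16, affine=True)
print(torch.all(m.weight == 1.0))  # tensor(True)
print(torch.all(m.bias == 0.0))    # tensor(True)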

I think nn.InstanceNorm2d might use the biased variance, so changing the variance calculation to:

x_var = torch.var(x, dim=(2, 3), unbiased=False, keepdim=True)

should work (I haven’t checked the source code of nn.InstanceNorm2d, but batchnorm layers use the biased variance, so I just tested your code quickly :wink: ).
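
For completeness, here is a minimal sketch of the full comparison with the biased variance, under the layer's default settings (affine=False, eps=1e-5) and with num_features set to 16 to match the input:

import torch
import numpy as np

x = torch.rand((8, 16, 32, 32))
a = torch.nn.InstanceNorm2d(16)
a.eval()
with torch.no_grad():
    b = a(x)

x_mean = torch.mean(x, dim=(2, 3), keepdim=True)
# biased variance (unbiased=False), matching what the normalization layers use
x_var = torch.var(x, dim=(2, 3), unbiased=False, keepdim=True)
x_norm = (x - x_mean) / torch.sqrt(x_var + 1e-5)

# the two results should now agree within a tolerance of 1e-6
print(np.allclose(b.numpy(), x_norm.numpy(), atol=1e-6))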


Thank you so much @ptrblck, you are right. This reduces the difference to within a tolerance of 1e-6, which I think is acceptable.