LayerNorm == InstanceNorm?

I found that the result of torch.nn.LayerNorm equals that of torch.nn.InstanceNorm1d. Why?
Testing code:

import torch

batch_size, seq_size, dim = 2, 3, 4
x = torch.randn(batch_size, seq_size, dim)

# layer norm
layer_norm = torch.nn.LayerNorm(dim, elementwise_affine=False)
print('y_layer_norm: ', layer_norm(x))
print('=' * 30)

# custom instance norm
eps: float = 0.00001
mean = torch.mean(x, dim=-1, keepdim=True)
var = torch.var(x, dim=-1, keepdim=True, unbiased=False)
print('y_custom: ', (x - mean) / torch.sqrt(var + eps))
print('=' * 30)

# instance norm
instance_norm = torch.nn.InstanceNorm1d(dim, affine=False)
print('y_instance_norm', instance_norm(x))
print('=' * 30)

# follow the description in https://pytorch.org/docs/stable/generated/torch.nn.LayerNorm.html
# For example, if normalized_shape is (3, 5) (a 2-dimensional shape),
# the mean and standard-deviation are computed over the last 2 dimensions of the input (i.e. input.mean((-2, -1)))
mean = torch.mean(x, dim=(-2, -1), keepdim=True)
var = torch.var(x, dim=(-2, -1), keepdim=True, unbiased=False)
print('y_custom1: ', (x - mean) / torch.sqrt(var + eps))
print('=' * 30)
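# extra check (if I read the doc right): the (3, 5) example in the doc refers to a
# 2-D normalized_shape, so LayerNorm((seq_size, dim)) should match y_custom1 above,
# while LayerNorm(dim) only normalizes over the last dimension
layer_norm_2d = torch.nn.LayerNorm((seq_size, dim), elementwise_affine=False)
print('y_layer_norm_2d: ', layer_norm_2d(x))
print('=' * 30)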

results:

(output screenshot omitted: y_layer_norm, y_custom, and y_instance_norm print identical values, while y_custom1 differs)

In the results, I found that LayerNorm equals InstanceNorm1d. I also wrote out the computation by hand, and it looks as if the description in the LayerNorm doc may not be correct. Am I missing something, or are LayerNorm and InstanceNorm1d in PyTorch exactly equal?
Hope someone can answer this question, thanks!

This seems like a bug in nn.InstanceNorm1d when affine=False.

nn.InstanceNorm1d should take an input of shape (batch_size, dim, seq_size).
However, when affine=False, nn.InstanceNorm1d also accepts an input with the wrong shape (batch_size, seq_size, dim) and silently produces a LayerNorm-like result.
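As a minimal sketch of the intended usage: if the input is permuted to the expected (batch_size, dim, seq_size) layout, InstanceNorm1d normalizes each channel over the sequence dimension, and the result should no longer match LayerNorm.

import torch

batch_size, seq_size, dim = 2, 3, 4
x = torch.randn(batch_size, seq_size, dim)

# expected layout for InstanceNorm1d: (batch_size, dim, seq_size)
x_cl = x.transpose(1, 2)                                   # (2, 4, 3)
instance_norm = torch.nn.InstanceNorm1d(dim, affine=False)
y = instance_norm(x_cl).transpose(1, 2)                    # back to (2, 3, 4)

# per-channel normalization over seq_size differs from LayerNorm over dim
layer_norm = torch.nn.LayerNorm(dim, elementwise_affine=False)
print(torch.allclose(y, layer_norm(x)))                    # False (in general)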
