I have a question about normalization techniques.

I encountered a very weird phenomenon while doing normalization.

import torch

x = torch.rand(10, 10)
print((x - x.mean(dim=1, keepdim=True)).mean(dim=1, keepdim=True))

It doesn’t give a zero tensor! The result is very small, but it is not zero.

What is going on???

This is numerical precision: floating-point arithmetic isn’t exact, so subtracting the computed mean will not leave you with an exactly zero mean. If you switch to double precision, the residual in your example drops from roughly 5e-8 to roughly 5e-17. The effect can be more substantial for larger tensors, since more values are accumulated.
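A quick sketch of the comparison (the exact residuals depend on the random values, but the orders of magnitude should match):

import torch

x = torch.rand(10, 10)  # float32 by default

# Residual mean after centering in single precision
resid32 = (x - x.mean(dim=1, keepdim=True)).mean(dim=1, keepdim=True)

# Same computation in double precision
xd = x.double()
resid64 = (xd - xd.mean(dim=1, keepdim=True)).mean(dim=1, keepdim=True)

print(resid32.abs().max())  # tiny but nonzero, on the order of 1e-8
print(resid64.abs().max())  # far smaller, on the order of 1e-17

Both residuals are small relative to the data (which is on the order of 1), so for most normalization purposes the float32 result is perfectly fine.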

Best regards

Thomas