I have a question about a normalization technique.
I encountered a very weird phenomenon while doing normalization.
import torch

x = torch.rand(10, 10)
print((x - x.mean(dim=1, keepdim=True)).mean(dim=1, keepdim=True))
It doesn’t give a zero tensor! The result is very small, but it is not zero.
What is going on???
This is numerical precision: floating-point arithmetic isn’t exact, so you will not get an exactly zero mean. If you switch to double precision, the residual drops from roughly 5e-8 to roughly 5e-17 in your example. The effect can be more substantial for larger tensors.
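A minimal sketch of that comparison, assuming your 10x10 example (the exact numbers will vary with the random values):

import torch

x = torch.rand(10, 10)

# float32: the residual mean after centering is tiny but nonzero (around 1e-8)
res32 = (x - x.mean(dim=1, keepdim=True)).mean(dim=1, keepdim=True)
print(res32.abs().max())

# float64: the same computation leaves a residual near double-precision epsilon (around 1e-17)
x64 = x.double()
res64 = (x64 - x64.mean(dim=1, keepdim=True)).mean(dim=1, keepdim=True)
print(res64.abs().max())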
Best regards
Thomas