# Numerical stability of torch.randn

So I’ve been testing `torch.randn` and I am not sure if I am doing something wrong, but the routine seems to suffer from some numerical instability. Here is a notebook for replicating my results in Colab: Google Colab
I am using a T4-GPU

```python
import torch

h = torch.randn(100000000, dtype=torch.float64, device='cuda')
eps = torch.finfo(torch.float64).eps
print(f'Eps number {eps}')
print(f'Mean {h.mean(dtype=torch.float64)}')
print(f'Std {h.std()}')
```

Output:

```
Eps number 2.220446049250313e-16
Mean 3.946074394742064e-05
Std 0.9999776430693459
```

Though the mean is close to zero, it is far away from the reported `eps`: it’s many orders of magnitude bigger. I ran some bootstrap simulations and the result does not seem to be biased: a thousand simulations of one million points each.

Mean: (plot not included)

Std: (plot not included)

I also computed the quantiles, and in 99% of the simulations the mean is at least 3 times the `eps` value away from zero.
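For reference, a scaled-down sketch of such a bootstrap, using the stdlib `random.gauss` as a stand-in for `torch.randn` and far fewer points so it runs quickly (the counts here are assumptions, not the original 1000 × 1e6 setup):

```python
import random
import statistics

random.seed(0)

n_sims, n_points = 200, 10_000  # scaled down from 1000 simulations of 1e6 points

# One "simulation" = the mean of n_points standard-normal draws
means = []
for _ in range(n_sims):
    samples = [random.gauss(0.0, 1.0) for _ in range(n_points)]
    means.append(statistics.fmean(samples))

means.sort()
# Empirical lower/upper quantiles of the simulated means
print('~1% quantile:', means[n_sims // 100])
print('~99% quantile:', means[-(n_sims // 100)])
```

Even at this reduced scale, the quantiles sit on the order of 1e-2 (roughly 1/sqrt(10_000)), i.e. enormously larger than float64 `eps`.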

Hi Uumami!

This is to be expected.

The mean of a bunch of (independent, identically-distributed) random
numbers has standard deviation `sigma / sqrt (n)`, where `sigma` is
the standard deviation of the individual samples and `n` is the number of
samples over which you take the mean.*

`randn()` generates samples that have standard deviation `1.0` and since
you take the mean of `1.e8` samples, the standard deviation of your mean
will be `1.e-4`. This means that you don’t expect your mean to be “very
close” to zero, but, rather, “about” `1.e-4` away from zero.

So the mean you calculate, `3.946e-5`, is a perfectly expected value.
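A quick numerical check of this `sigma / sqrt (n)` rule, sketched with the stdlib `random.gauss` in place of `randn()` and a smaller `n` so the predicted standard error is `1 / sqrt(10_000) = 0.01`:

```python
import math
import random
import statistics

random.seed(42)

n = 10_000
sigma = 1.0                    # std of the individual samples
stderr = sigma / math.sqrt(n)  # predicted std of the sample mean: 0.01

# Mean of n standard-normal samples
m = statistics.fmean(random.gauss(0.0, sigma) for _ in range(n))
print(f'sample mean: {m:.5f}')
print(f'predicted std of the mean: {stderr}')

# The sample mean is typically within a few stderr of zero --
# nowhere near machine eps.
assert abs(m) < 5 * stderr
```

With `n = 1e8`, the same arithmetic gives `stderr = 1e-4`, matching the `3.946e-5` observed above.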

*) This is discussed in many places: one reference would be Wikipedia’s
article on Bienaymé’s identity. (Don’t let the generality of this reference
put you off; the simple version we are using is a well-known result from
elementary statistics.)

Best.

K. Frank
