The sum() function loses precision when evaluating float32 data on the CPU

version: pytorch==1.3.0
code:

import torch

input = torch.randn((2, 128, 10, 6), dtype=torch.float32)
out = input.sum()
print("%.10f" % out.item())
# output: 0.0181007385

Comparison tests computed on the same data:

sum:        0.0179971529
numpy:      0.017990112  (RE = 0.04%)
torch cuda: 0.0179862976 (RE = 0.06%)
torch cpu:  0.0181007385 (RE = 0.57%)
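
A minimal script for rerunning this kind of comparison (the original data was not seeded, so the exact numbers above will not reproduce; the seed below is arbitrary):

import torch

torch.manual_seed(0)  # arbitrary seed, only so the comparison is repeatable
x = torch.randn((2, 128, 10, 6), dtype=torch.float32)

print("torch cpu : %.10f" % x.sum().item())           # float32 accumulation on the cpu
print("numpy     : %.10f" % x.numpy().sum())          # numpy uses pairwise summation
print("float64   : %.10f" % x.double().sum().item())  # float64 reference
if torch.cuda.is_available():
    print("torch cuda: %.10f" % x.cuda().sum().item())  # float32 accumulation on the gpu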

Some summation error with float32 data is normal, and the other results are close to one another in precision. However, the Torch CPU result deviates much more. How should I judge the accuracy of the data?

Is it that the other computing libraries compensate for the accumulated error while the Torch CPU does not, so its error is higher? Or does the Torch CPU use some special summation procedure?
Where can I find the implementation of the accumulation?

Hi! I’m not very familiar with the subject, but maybe this issue page can help you:

Hi Zebulun!

As Utku suggested with his link, the imprecision you see is expected
as it is consistent with 32-bit floating-point round-off error.

You are summing about 15,000 normally distributed values, so
the expected size of your sum is about 124. Let’s treat your numpy
result as the exact result. Then your cpu result is off in the fourth
place after the decimal point, which, relative to the expected size
of the sum, is fully consistent with round-off error.

However, there is something fishy going on with your numbers. The
value you get for your sum is improbably small. As I mentioned
above, the expected size of your sum is about 124, so a value of
0.018 is very unlikely – about one chance in ten thousand. I can’t
think of any good reason this should happen, and it would be pushing
your luck to suggest that luck explains it.

Here is a script that goes through the calculations relevant to these
two points:

import numpy as np
import math

tdim = (2, 128, 10, 6)
sumcpu = 0.0181007385
sumnpy = 0.017990112

# number of values summed and the expected size (standard deviation) of their sum
nSum = np.prod (tdim)
expected_size = np.sqrt (nSum)

# error of the cpu sum relative to the expected size of the sum
relerr = (sumcpu - sumnpy) / expected_size

print ('nSum =', nSum, ', expected_size =', expected_size)
print ('relerr =', relerr)

# probability that a sum of nSum standard normals has magnitude smaller than sumnpy
prob_small = math.erf (abs (sumnpy) / (math.sqrt (2.0) * expected_size))
print ('prob_small =', prob_small)

And here is its output:

nSum = 15360 , expected_size = 123.93546707863734
relerr = 8.92613733644216e-07
prob_small = 0.00011581860221173588

Two things:

relerr = 8.92613733644216e-07

32-bit round-off error is about 1.e-7. You are summing over about
15,000 values of order one, so the round-off error could easily
accumulate to this level.
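
To see this kind of accumulation directly, here is a small sketch (for illustration only – the naive left-to-right loop and the seed are my own choices, not what pytorch actually does internally):

import numpy as np

rng = np.random.default_rng (0)   # arbitrary fixed seed
vals = rng.standard_normal (15360).astype (np.float32)

# naive left-to-right accumulation in float32
acc = np.float32 (0.0)
for v in vals:
    acc += v

# float64 reference sum, with much smaller round-off
ref = vals.astype (np.float64).sum ()

print ('naive float32 sum =', acc)
print ('float64 reference =', ref)
print ('relerr =', abs (float (acc) - ref) / np.sqrt (vals.size))

On a typical run the relerr printed here lands within an order of magnitude of the one above.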

prob_small = 0.00011581860221173588

When you sum 15,000 (independent) normally distributed values
(each with standard deviation one), the sum comes from a
gaussian distribution that has a standard deviation of about 124 (and
a mean of zero). You can calculate the probability of such a sum
having a magnitude smaller than 0.018 using the so-called error
function, P (|S| <= a) = erf (a / (sigma * sqrt (2))),
and that probability is very small – about 1% of 1%.
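
As a sanity check, the same probability can be computed from the normal cdf instead of erf, using the identity 2 * Phi (a / sigma) - 1 = erf (a / (sigma * sqrt (2))) (a sketch using python’s standard-library NormalDist):

from statistics import NormalDist

sigma = 123.93546707863734   # expected size of the sum, sqrt (15360)
a = 0.017990112              # observed magnitude of the sum

# P (|S| <= a) for S ~ N (0, sigma**2)
prob_small = 2.0 * NormalDist (mu = 0.0, sigma = sigma).cdf (a) - 1.0
print ('prob_small =', prob_small)   # ~1.158e-4, matching the erf value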

Best.

K. Frank
