If I integrate over 100,000 samples, I get slightly different answers depending on whether I code the trapezoidal rule naively or use torch.trapz:
import torch # Import PyTorch
samples = 100000
x = torch.linspace(-5, 5, samples)
y = x**2
integral = 0  # Initialise the accumulator
for i in range(samples - 1):  # naive left-to-right trapezoidal sum
    integral = integral + 0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i])
integral2 = torch.trapz(y, x)  # built-in trapezoidal integration
Does anyone know why this is, and how to fix it? Thanks!
One thing I always recommend when doing this sort of experiment is to check whether the discrepancy is more likely a numerical-precision effect or a genuine algorithmic difference.
So if you add the dtype=torch.double argument to the torch.linspace call, you’ll see that the difference shrinks significantly (for me from ~9e-4 to ~7e-14).
This points to the two algorithms computing the same result but perhaps performing the additions in a different order, which typically produces a small discrepancy due to limited floating-point precision.
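Here is a minimal sketch of that check, wrapping the naive loop from the question in a helper (the function name discrepancy is my own) so the single- and double-precision gaps can be compared directly:

```python
import torch

def discrepancy(dtype):
    # Trapezoidal integral of x**2 over [-5, 5]; exact value is 250/3.
    n = 100000
    x = torch.linspace(-5, 5, n, dtype=dtype)
    y = x**2
    integral = 0.0
    for i in range(n - 1):  # naive left-to-right accumulation
        integral = integral + 0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i])
    # Gap between the manual loop and the built-in routine
    return abs(integral - torch.trapz(y, x)).item()

print(discrepancy(torch.float32))  # roughly 1e-4 in single precision
print(discrepancy(torch.float64))  # roughly 1e-13 in double precision
```

If the gap collapses by many orders of magnitude in double precision, as it does here, summation order rather than a different algorithm is the likely culprit.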
That’s awesome, Thomas! Thanks so much for the reply. It’s a really helpful tip I didn’t know about, and I really appreciate it.