I’m trying to find the theoretical idea behind the formula in torch.isclose():

|input − other| ≤ atol + rtol × |other|

Why do they combine absolute tolerance and relative tolerance in this formula? I understand that |input − other| here is the absolute error, but why require absolute error ≤ atol + rtol × |other|?
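To make the question concrete, here is a plain-Python sketch of the same check for scalars (torch applies it elementwise; the function name and example values here are just for illustration). It shows how the two terms split the work: atol dominates near zero, rtol dominates for large magnitudes.

```python
# Plain-Python sketch of the torch.isclose formula for two scalars.
def isclose(input, other, rtol=1e-05, atol=1e-08):
    return abs(input - other) <= atol + rtol * abs(other)

# Near zero, the atol term does the work:
print(isclose(0.0, 1e-9))         # True: 1e-9 <= 1e-8

# For large numbers, the rtol term does the work:
print(isclose(1e6, 1e6 + 1.0))    # True: 1.0 <= 1e-8 + 1e-5 * 1e6 ≈ 10
print(isclose(1e6, 1e6 + 100.0))  # False: 100.0 > ≈ 10
```

So the formula is just "absolute tolerance, plus an allowance that grows linearly with the magnitude of the reference value".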

I think it is mostly a convenience thing, originating from the old observation that 10.0 times 0.1 is hardly ever exactly 1.0 in floating point.
I’d think about it in terms of “intuitively, close means absolute error <= atol, but for large numbers we need something extra”.
You can see that “are these equal up to numerical error” typically needs a relative tolerance as well with an experiment that sets rtol to 0, like this (it might not work every time you draw random numbers, but very often; I tried a thousand times and it came out like this every time):

In [1]: import torch
In [2]: a = torch.randn(1000, dtype=torch.float64)
In [3]: b = a.float().double()
In [4]: torch.isclose(a, b).all()
Out[4]: tensor(True)
In [5]: torch.isclose(a, b, rtol=0).all()
Out[5]: tensor(False)

What I was trying to show, I guess, is that floating-point representation inherently has relative accuracy. You then want an absolute term for when you encounter actual zeros.
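The "actual zeros" point is easy to demonstrate with the same scalar sketch of the formula (the function here is my own restatement, not the torch implementation): when other is exactly zero, rtol × |other| is zero, so with atol=0 nothing nonzero can ever be "close" to 0.0, no matter how tiny it is.

```python
# Sketch: why rtol alone fails at exact zeros.
def isclose(input, other, rtol=1e-05, atol=1e-08):
    return abs(input - other) <= atol + rtol * abs(other)

tiny = 1e-300  # far smaller than any plausible rounding error

# With atol=0, the bound collapses to 1e-5 * |0.0| == 0, so this fails:
print(isclose(tiny, 0.0, atol=0.0))  # False
# With the default atol, the absolute term catches it:
print(isclose(tiny, 0.0))            # True
```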

Do you have any official documentation about the relative tolerance and absolute tolerance in this formula? It’s really hard to find anything out there, and no one explains why rtol=1e-05 and atol=1e-08 were chosen.

I am confused about the point of rtol. Why not just use a single tolerance value tol, and return True if |a − b| < tol? Obviously, following the above equation, I could do this manually by setting rtol to zero, thereby making everything symmetric. What is the point of the symmetry-breaking rtol factor?

You generally want to use rtol: since the precision of numbers and calculations is very much finite, larger numbers will almost always be less precise than smaller ones, and the absolute error scales roughly linearly with magnitude (again, in general). The only time atol matters is for numbers so close to zero that rounding errors are liable to be larger than the number itself.

Another way to look at it is atol compares fixed decimal places, while rtol compares significant figures.
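That decimal-places vs. significant-figures intuition can be checked against the formula itself (again using a scalar restatement of the check, with made-up tolerances for the sake of the example):

```python
def isclose(input, other, rtol=1e-05, atol=1e-08):
    return abs(input - other) <= atol + rtol * abs(other)

# atol=1e-2 alone ~ "agree to 2 decimal places", so it breaks at scale:
print(isclose(3.141, 3.142, rtol=0.0, atol=1e-2))    # True
print(isclose(3141.0, 3142.0, rtol=0.0, atol=1e-2))  # False

# rtol=1e-3 alone ~ "agree to ~3 significant figures", at any scale:
print(isclose(3.141, 3.142, rtol=1e-3, atol=0.0))    # True
print(isclose(3141.0, 3142.0, rtol=1e-3, atol=0.0))  # True
```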

The defaults were probably chosen because they match NumPy’s np.isclose defaults (rtol=1e-05, atol=1e-08), which are themselves typical values people care about in scientific/engineering work with double precision.