torch.finfo() eps weird behavior

Hello, so I think showing the problematic behavior I’ve encountered first is best:

So I’ve been training policy-gradient (reinforcement learning) methods with TanhDistributions. The Tanh distribution’s built-in log_prob() method is very sensitive to input values close to 1 and -1 (it gives NaN for 1.0 and -1.0, which often crashes the entire program). For float32 this means that I sometimes need to torch.clamp the input with an epsilon. I noticed this odd behavior while debugging that.
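For readers unfamiliar with the clamping trick mentioned above, here is a minimal sketch of the idea in plain Python float64, not the actual training code. The helper names clamp_open_interval and safe_atanh are hypothetical; in PyTorch the same idea would presumably use torch.clamp together with torch.finfo(dtype).eps for the tensor's dtype.

```python
import math
import sys

# float64 machine epsilon (2**-52); for float32 tensors one would
# instead use the eps of that dtype.
EPS = sys.float_info.epsilon

def clamp_open_interval(y, eps=EPS):
    """Hypothetical helper: clamp y into [-1 + eps, 1 - eps]."""
    return max(-1.0 + eps, min(1.0 - eps, y))

def safe_atanh(y):
    """Hypothetical helper: atanh diverges at +/-1, so clamp first."""
    return math.atanh(clamp_open_interval(y))

print(safe_atanh(1.0))    # large but finite
print(safe_atanh(-1.0))   # large negative but finite
print(safe_atanh(0.5) == math.atanh(0.5))  # interior values pass through unchanged
```

Both 1.0 - eps and -1.0 + eps are exactly representable, so the clamped value is guaranteed to differ from the endpoints.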

According to the Type Info documentation, eps is supposed to be the smallest number for which 1.0 + eps != 1.0, which makes me believe that 1.0 + x * eps != 1.0 should be False for any x < 1. Yet the comparison only becomes False once x drops to 0.5, and even then only for 1.0 + 0.5*eps != 1.0, not for 1.0 - 0.5*eps != 1.0.

Am I just misunderstanding what eps is supposed to be here?

Hi Dan!

This is a documentation bug. Quoting from pytorch’s 1.8.1 documentation:

eps     float     The smallest representable number
                  such that 1.0 + eps != 1.0.

This is sort of on the right track, but really isn’t correct.

Numpy’s (1.20) documentation gets it right:

eps : float

    The difference between 1.0 and the next smallest
    representable float larger than 1.0. For example,
    for 64-bit binary floats in the IEEE-754 standard,
    eps = 2**-52, approximately 2.22e-16.

What’s at issue in your numerical test is that neither 1.0 + 0.5 * eps
nor 1.0 + 0.9 * eps is exactly representable, so the floating-point
arithmetic has to decide whether to round up or down to a representable
number.

In the first case, it rounds down to 1.0; in the second, up to 1.0 + eps.
(By definition, there are no representable numbers in between.)
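The rounding described above can be checked without PyTorch. The sketch below uses Python's built-in float64 (so eps is 2**-52 rather than the float32 value) and assumes Python 3.9 or later for math.nextafter. The variable names are only illustrative.

```python
import math
import sys

eps = sys.float_info.epsilon          # float64 machine epsilon, 2**-52
next_up = math.nextafter(1.0, 2.0)    # next representable float above 1.0

# NumPy-style definition: eps is the gap between 1.0 and the next float up.
gap_is_eps = (next_up - 1.0 == eps)

# 1.0 + 0.5*eps lies exactly halfway between 1.0 and 1.0 + eps;
# the tie is broken toward the value with an even mantissa, which is 1.0.
half_rounds_down = (1.0 + 0.5 * eps == 1.0)

# 1.0 + 0.9*eps is closer to 1.0 + eps, so it rounds up.
nine_tenths_rounds_up = (1.0 + 0.9 * eps == next_up)

# So eps is not the smallest x with 1.0 + x != 1.0:
# anything just above 0.5*eps already changes the sum.
smaller_x_changes_sum = (1.0 + 0.5000001 * eps != 1.0)

print(gap_is_eps, half_rounds_down, nine_tenths_rounds_up, smaller_x_changes_sum)
```

All four flags evaluate to True under IEEE-754 round-to-nearest-even, which is why the "smallest number such that 1.0 + eps != 1.0" wording is misleading.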

Here’s an illustration that shows what is going on including the correctness
of numpy’s definition of eps:

>>> torch.__version__
>>> torch.tensor ([1.0]) + 0.5000001 * torch.finfo().eps == 1.0
>>> torch.tensor ([1.0]) + 0.5000000 * torch.finfo().eps == 1.0
>>> (torch.tensor ([1.0]) + 0.5000001 * torch.finfo().eps) - 1.0 == torch.finfo().eps
>>> (torch.tensor ([1.0]) + 0.5000000 * torch.finfo().eps) - 1.0 == torch.finfo().eps
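As for the asymmetry raised in the question (1.0 - 0.5*eps != 1.0 being True), a likely explanation is that representable floats just below 1.0 are spaced eps / 2 apart, half the spacing just above 1.0, so 1.0 - 0.5*eps is itself exactly representable. A small float64 sketch of that claim, assuming Python 3.9 or later for math.nextafter (the same reasoning should carry over to float32):

```python
import math
import sys

eps = sys.float_info.epsilon            # gap between 1.0 and the next float up
next_down = math.nextafter(1.0, 0.0)    # next representable float below 1.0

# Below 1.0 the exponent drops by one, so the gap is half as large.
gap_below_is_half_eps = (1.0 - next_down == eps / 2)

# 1.0 - 0.5*eps is exactly the next float down, so no rounding happens.
minus_half_is_exact = (1.0 - 0.5 * eps == next_down)
minus_half_differs = (1.0 - 0.5 * eps != 1.0)

# Above 1.0 the same offset is only half a gap and rounds back to 1.0.
plus_half_differs = (1.0 + 0.5 * eps != 1.0)

print(gap_below_is_half_eps, minus_half_is_exact, minus_half_differs, plus_half_differs)
```

The first three flags come out True and the last one False, matching the behavior reported in the question.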


K. Frank


Aaah now I get it. The numpy description was a bit more clear to me.

Thank you for the thorough answer KFrank!