Dividing two floats

Hi,

I can’t seem to understand how one controls the precision when dividing two floats. For instance,

import torch

x = torch.tensor(10e-3)
y = torch.tensor(1e-3)
qt = x / y
qt_int = int(x / y)
print(f"{qt}")
print(f"{qt_int}")

Gives 9.999999046325684 and 9. Is there a way to make sure that it evaluates to 10? Perhaps I will need to specify the tolerance somewhere? Any tips/suggestions/help in understanding this would be really awesome.

Thanks very much!

No, there isn’t a way to increase the precision besides using a wider dtype (e.g. float64), which would yield a more precise result.

You are currently experiencing the limited floating point precision as described here.

The input values are not exactly representable as seen here:

import torch

torch.set_printoptions(precision=20)

torch.tensor(1e-3)
# tensor(0.00100000004749745131)
torch.tensor(10e-3)
# tensor(0.00999999977648258209)

which explains the small expected error.
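
For example, repeating the division in float64 (a quick check of my own, not part of the original reply) happens to give the exact result for these particular values:

import torch

# same division as in the question, but in double precision (float64)
x = torch.tensor(10e-3, dtype=torch.float64)
y = torch.tensor(1e-3, dtype=torch.float64)
print(x / y)       # exactly 10 for these particular inputs (exactness is not guaranteed in general)
print(int(x / y))  # 10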

Thank you. I have implemented a workaround using rounding off, but I was wondering: how do more experienced programmers deal with this problem?

A better description of the problem: I have a discrete time-series x[n] with a resolution of dt (i.e., consecutive measurements x[k] and x[k+1] are sampled a time dt apart). I want to bin this time-series using a user-provided interval DT. The precision issue arose when I tried to determine the bin size in number of samples using torch.floor(DT/dt), with DT = 10e-3 and dt = 1e-3.

Any tips would be awesome. This binning will be fundamental to my code, so I want to make sure I’ve got it implemented as well as possible.

Cheers

Hi Chauhan!

The best approach – if you can manage it – would be to work with exact integers, rather than potentially-inexact floating-point numbers. So, for example, express DT as an integer, in units of dt. If you need, at some point, floating-point numbers, temporarily convert your integers to floating-point numbers when you use them as such.
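
A minimal sketch of that idea (the variable names are my own, not from the post), assuming the user supplies DT as a whole number of dt units:

dt = 1e-3                              # base sampling resolution, in seconds
DT_in_units_of_dt = 10                 # the user provides DT as an integer count of dt steps
n_samples_per_bin = DT_in_units_of_dt  # exact -- no floating-point division involved
DT_seconds = DT_in_units_of_dt * dt    # convert to float only when a physical time is needed
print(n_samples_per_bin, DT_seconds)   # 10 0.01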

If your user insists on providing you DT as a floating-point number (that is expected to be an integer multiple of dt), then use round() (rather than floor() or int()) to obtain the integer. You can optionally validate that the result is close enough to being an integer not to be a mistake:

(torch.round(DT / dt) / (DT / dt) - 1).abs() < tol

for some suitably chosen tolerance, tol.
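
A runnable sketch of that suggestion (the helper name and default tolerance are my own, hypothetical choices):

import torch

def n_bins_from_float(DT, dt, tol=1e-4):
    # round to the nearest integer instead of truncating, then validate
    # that DT really is (close to) an integer multiple of dt
    ratio = torch.tensor(DT) / torch.tensor(dt)
    n = torch.round(ratio)
    if not (n / ratio - 1).abs() < tol:
        raise ValueError(f"DT = {DT} is not close to an integer multiple of dt = {dt}")
    return int(n)

print(n_bins_from_float(10e-3, 1e-3))  # 10, even though 10e-3 / 1e-3 is 9.9999990... in float32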

Best.

K. Frank

Why use torch floats for this particular operation? Regular Python types work fine:

In [1]: DT = 10e-3

In [2]: dt = 1e-3

In [3]: DT/dt
Out[3]: 10.0

In [4]: int(DT/dt)
Out[4]: 10

Normally you should work with torch objects when it’s important for speed or efficiency, but this seems like a one-off operation that you can do once (or a relatively small number of times) and move on with your life without much drag on overall program performance.

Cheers,
Andrei

Hi Andrei!

Using python types doesn’t fix (or otherwise address) the issue of round-off error. By default, pytorch uses single-precision (float32) floating-point numbers, while python uses double-precision (float64). That only changes which specific values trigger this specific symptom. It doesn’t make the core problem go away – it just moves it around from one place to another:

>>> import sys
>>> sys.version
'3.6.5 |Anaconda, Inc.| (default, Mar 29 2018, 13:32:41) [MSC v.1900 64 bit (AMD64)]'
>>> DT = 10e-17
>>> dt = 1e-17
>>> DT / dt
9.999999999999998
>>> int (DT / dt)
9

Best.

K. Frank

Great point!

This is hacky, but you could always do something like this assuming you have some hard floor on the minimum resolution. The example below assumes that hard floor happens to be 1e-33.

x = 10e-33
y = 1e-33

(x * 1e+17) / (y * 1e+17)
Out: 10.0

int((x * 1e+17) / (y * 1e+17))
Out: 10

In practice, it’s hard to believe you’re not going to have some lower bound on the resolution interval. A resolution of 1e-3 is easy to go below in many practical cases (as absurd as it is to make that statement in general, rather than with reference to a particular domain, it is nonetheless plausible); 1e-33, much less so.

Hi Andrei!

The issue is not whether there is some (small) floor to the resolution interval, y (called dt in the posts above).

Instead, the question is whether an “integer” multiple of y, including its round-off error, falls a little bit below or a little bit above (or exactly on) the true integer multiple of y. You can fall a little below the true integer multiple – and hence get the wrong result when converting (truncating) to int – even for a “large” resolution interval such as 1e-1:

>>> import sys
>>> sys.version
'3.6.5 |Anaconda, Inc.| (default, Mar 29 2018, 13:32:41) [MSC v.1900 64 bit (AMD64)]'
>>> x = 7e-1
>>> y = 1e-1
>>> x / y
6.999999999999999
>>> int (x / y)
6
>>> (x * 1e+17) / (y * 1e+17)
6.999999999999999
>>> int ((x * 1e+17) / (y * 1e+17))
6

For the use case that I believe Chauhan envisions, it should suffice to round() to the nearest integer, rather than truncating to int().
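
Continuing the session above as a quick illustration (this snippet is my own addition, not from the original reply):

>>> round(x / y)   # round to the nearest integer instead of truncating
7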

Best.

K. Frank

Wow!

0.7 / 0.1 == 7.0
Out: False

I guess they don’t make 'em like they used to…

Hi,

Thank you for this very interesting discussion! My problem is compounded by the constraints of the mapping. Ideally, I’d like to use a floor operation so that the bin size is conservative, but the round-off error makes this tricky. If the function is f(Dt, dt), I’d like:

Dt,dt = 10.0,1.0
f(Dt,dt) = 10

Dt,dt = 9.9,1.0
f(Dt,dt) = 9

If Dt/dt always evaluated exactly (e.g. if 10e-3/1e-3 gave exactly 10.0), I could accomplish this with a simple floor. But alas, I have to resort to other tricks. I have come up with:

import torch

def compute_n_bins(Dt, dt, relTol=1e-3):
    # Dt and dt are expected to be (0-dim) tensors
    qt = Dt / dt
    n_bins = torch.floor(qt)

    # due to round-off, qt can land just below an integer, i.e. its fractional
    # part can only approach 1 from the left -- bump the floor up in that case
    if 1 - torch.frac(qt) < relTol:
        n_bins += 1

    return int(n_bins)
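
A quick check of my own against the desired mapping and the original problem values:

print(compute_n_bins(torch.tensor(10.0), torch.tensor(1.0)))    # 10
print(compute_n_bins(torch.tensor(9.9), torch.tensor(1.0)))     # 9
print(compute_n_bins(torch.tensor(10e-3), torch.tensor(1e-3)))  # 10, even though 10e-3/1e-3 gives 9.9999990...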

I think primitive float types in computational frameworks should implement integer-division checks (do python native floats do this? Maybe? #Noob). But I guess we work with what we’ve got! Plus, such checks would probably add a speed penalty in cases where there is no need for one.
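
For what it’s worth (an aside of mine, not from the thread): python floats do provide an is_integer() check, though it is subject to the same round-off as the division itself:

print((10e-3 / 1e-3).is_integer())  # True  (python doubles happen to divide exactly here)
print((0.7 / 0.1).is_integer())     # False (the quotient is 6.999999999999999)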

Please let me know if you think there is something blatantly silly happening in my compute_n_bins, or if you know of a simpler/more direct way of doing this. I recognise that torch.frac is probably an expensive operation, but I am hoping to run compute_n_bins less than 10 times in the whole computation.

Cheers!