PyTorch 1.6.0: Wrong true_divide result on CUDA

thongnt · August 7, 2020, 3:24pm

Hello,

I just tried the new true_divide method in PyTorch 1.6.0.
It may have a bug, which took me quite a lot of time to track down.

On the CPU, torch.tensor([435]).true_divide(15).ceil().long() gives 29 (the correct result), while the CUDA version, torch.tensor([435]).to("cuda:1").true_divide(15).ceil().long(), gives 30.

It might be related to the precision of math operations on CUDA.
Do you guys know how to fix it?
Thank you.

ptrblck · August 10, 2020, 7:29am

The error is not caused by the true_divide operation itself but by the ceil that follows it: the CUDA result is not exactly 29 but slightly larger, off by a small eps of ~1e-6, so ceil rounds it up to 30. This eps might arise if the division is executed as a multiplication by the reciprocal.
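
To illustrate the explanation above, here is a minimal sketch in plain Python (no GPU or PyTorch needed) that emulates float32 arithmetic with the struct module. It only models the hypothesis that the division is carried out as a multiplication by a float32 reciprocal; it does not show what the CUDA kernel actually does. The helper names to_f32 and ceil_div are made up for this example.

```python
import math
import struct


def to_f32(x: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def ceil_div(a: int, b: int) -> int:
    """Ceiling division for integers, done entirely in integer arithmetic."""
    return -(-a // b)


# Direct division: 435 / 15 is exactly representable, so ceil is safe.
exact = to_f32(435.0 / 15.0)

# Assumed alternative: multiply by the float32 reciprocal of 15.
# float32(1/15) is slightly larger than 1/15, so the product lands
# just above 29 after rounding to float32.
recip = to_f32(1.0 / 15.0)
via_recip = to_f32(435.0 * recip)

print(exact, math.ceil(exact))          # ceil gives 29
print(via_recip, math.ceil(via_recip))  # ceil gives 30

# Integer-only workaround: no floating-point rounding is involved.
print(ceil_div(435, 15))                # 29
```

If the inputs are integers, as in the original post, doing the ceiling division in integer arithmetic avoids the problem entirely. For floating-point inputs, one possible workaround is to subtract a small tolerance before calling ceil, at the cost of choosing that tolerance for the expected value range.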