Should a PyTorch-trained model produce the exact same results on Windows 10 / Ubuntu?

ranshadmi · June 25, 2018, 11:34am

Hi all,

I’m using PyTorch 0.4 on my Windows 10 laptop, but deploying on Ubuntu Linux 16.04 (also PyTorch 0.4). Both machines use CUDA 9.0 (V9.0.176). Yet I’m getting very (very) slightly different results. That actually changes some of the predicted classes in my multi-class classification problem, but I’m mainly bothered by the inference not being bit-exact.
Is this behavior something I should expect? Can it be fixed?

Thanks,
Ran

albanD · June 25, 2018, 12:47pm

Hi,

I guess this is expected. The default cuDNN algorithms are not deterministic.
PyTorch CPU random number generation should be the same across platforms, but I’m not sure CUDA random number generation will be. Similarly for Python/NumPy random.
I’m afraid it’s going to be very time consuming to make it bit-exact, and you might end up with a non-negligible slowdown from not being able to use cuDNN kernels and other fast algorithms that are not bit-perfect.
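
To illustrate why fast kernels may not be bit-perfect, here is a minimal pure-Python sketch (no PyTorch involved). It assumes the differences come from floating-point additions being carried out in a different order, which is one common cause but not necessarily the one in this thread. The names `sum_in_order`, `predict`, `contributions` and `rival_score` are hypothetical and chosen only for the example.

```python
import math


def sum_in_order(values):
    """Accumulate the values strictly left to right, like a serial loop."""
    total = 0.0
    for v in values:
        total += v
    return total


def predict(score, rival_score):
    """Toy two-class decision: class 0 only if its score is strictly larger."""
    return 0 if score > rival_score else 1


# Hypothetical terms that add up to the score of class 0.
contributions = [0.1, 0.2, 0.3]
# Hypothetical score of the competing class 1.
rival_score = 0.6

# Same numbers, two accumulation orders (as two different kernels might use).
score_forward = sum_in_order(contributions)        # (0.1 + 0.2) + 0.3
score_reverse = sum_in_order(contributions[::-1])  # (0.3 + 0.2) + 0.1

# Floating-point addition is not associative, so the two sums are not bit-equal.
print(score_forward == score_reverse)

# They still agree to within a small relative tolerance.
print(math.isclose(score_forward, score_reverse, rel_tol=1e-9))

# When two classes are nearly tied, the last bit can decide the predicted class.
print(predict(score_forward, rival_score), predict(score_reverse, rival_score))
```

If this is what is happening, comparing the outputs of the two machines with a tolerance (as `math.isclose` does here) is usually a more realistic check than bit-equality, and samples whose prediction flips are likely ones where the top two scores are almost tied.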
