Are there any known issues with CUDA 10.0 and C++/CUDA extensions for PyTorch?
I am seeing strange behavior in my own C++/CUDA extension: under CUDA 10.0, the kernel code fails to modify the output tensor's values (i.e., all values remain 0), but the same code works with CUDA 9.0. To reproduce this behavior, I did the following:
I took the standard C++/CUDA extension (LLTM) from the PyTorch tutorial and added a print statement at line 60 of benchmark.py:
print(new_h.sum() + new_C.sum())
Then I compiled the CUDA extension in two configurations and tested it.
Configuration-1: gcc 6.4.0 + cuda 10.0 + pytorch 1.0.0.dev20181008
output printed:
tensor(0., device='cuda:0', grad_fn=<AddBackward0>)
Configuration-2: gcc 4.9.0 + cuda 9.0 + pytorch 1.0.0.dev20181008
output printed:
tensor(57.2220, device='cuda:0', grad_fn=<AddBackward0>)
This suggests an issue with either the CUDA version or gcc, but I cannot pin down which one it is. Can anyone with both CUDA 10.0 and CUDA 9.0 confirm this behavior?
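In case it helps others reproduce or diagnose this, here is a small sketch I would run in each configuration to compare the CUDA version PyTorch itself was built against with the nvcc toolkit that compiles the extension (the function names here are my own, not part of PyTorch; a mismatch between the two toolkits is one plausible cause of kernels silently doing nothing):

```python
# Diagnostic sketch: report the CUDA version of the installed PyTorch
# wheel and of the nvcc on PATH, so the two can be compared.
import subprocess

def parse_nvcc_release(version_text):
    """Extract 'X.Y' from `nvcc --version` output,
    e.g. '... release 10.0, V10.0.130' -> '10.0'."""
    marker = "release "
    start = version_text.index(marker) + len(marker)
    end = version_text.index(",", start)
    return version_text[start:end].strip()

if __name__ == "__main__":
    try:
        import torch
        # CUDA version the PyTorch binary was compiled with
        print("torch built with CUDA:", torch.version.cuda)
    except ImportError:
        print("torch not installed")
    try:
        out = subprocess.run(["nvcc", "--version"],
                             capture_output=True, text=True).stdout
        # CUDA toolkit that would compile the extension
        print("nvcc toolkit CUDA:", parse_nvcc_release(out))
    except (FileNotFoundError, ValueError):
        print("nvcc not found on PATH")
```

If the two versions differ, that difference (rather than gcc) might explain why the same extension source behaves differently across the two setups.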
Thanks!