I am deploying a neural network on my Ubuntu machine, and when the weights are initialised I get this error: “RuntimeError: CUDA error: no kernel image is available for execution on the device”.
nvidia-smi reports CUDA version 10.2.
I have 2 GPUs. One of them (a K40c) is very old and requires an old version of torchvision (0.4.0), but I am not using it: I specify cuda:1 and have made sure that index 1 points to the newer GPU (a Titan V).
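One way to double-check that mapping is sketched below. It assumes that setting CUDA_DEVICE_ORDER=PCI_BUS_ID before CUDA is initialised makes PyTorch's indices follow the same order as nvidia-smi; by default the CUDA runtime may order devices fastest-first, so the two numberings can differ. The helper names list_cuda_devices and find_device_index are hypothetical, made up for illustration.

```python
import os

# Assumption: this must be set before CUDA is initialised in this process,
# so that torch's device indices follow PCI bus order, as nvidia-smi does.
os.environ.setdefault("CUDA_DEVICE_ORDER", "PCI_BUS_ID")


def list_cuda_devices():
    """Return [(index, name, (major, minor)), ...] for each visible GPU.

    Returns an empty list if torch is not installed or CUDA is unavailable.
    """
    try:
        import torch
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    return [
        (i, torch.cuda.get_device_name(i), torch.cuda.get_device_capability(i))
        for i in range(torch.cuda.device_count())
    ]


def find_device_index(devices, name_fragment):
    """Return the index of the first device whose name contains name_fragment, else None."""
    for index, name, _capability in devices:
        if name_fragment.lower() in name.lower():
            return index
    return None


if __name__ == "__main__":
    devices = list_cuda_devices()
    for index, name, capability in devices:
        print(f"cuda:{index} -> {name}, compute capability {capability}")
    print("Titan V index:", find_device_index(devices, "titan v"))
```

If the Titan V turns out not to be index 1, the CUDA_VISIBLE_DEVICES environment variable can be used to hide the K40c from the process entirely.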
Following https://pytorch.org/get-started/previous-versions/, I made sure that the installed torch and torchvision versions are compatible with CUDA 10.2:
- torch version: 1.5.0
- torchvision: 0.6.0
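The installed versions can be confirmed from Python with something like the sketch below. Note that the CUDA version shown by nvidia-smi is the highest version the driver supports, while torch.version.cuda is the toolkit version the installed torch binary was built with, so the two need not match. The function name installed_versions is hypothetical.

```python
def installed_versions():
    """Return a dict of version strings; a value is None if the package is missing."""
    versions = {"torch": None, "torchvision": None, "cuda_build": None}
    try:
        import torch
        versions["torch"] = torch.__version__
        # CUDA toolkit version the binary was built with (None for CPU-only builds).
        versions["cuda_build"] = torch.version.cuda
    except ImportError:
        pass
    try:
        import torchvision
        versions["torchvision"] = torchvision.__version__
    except ImportError:
        pass
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(name, version)
```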
I still get the error, specifically at this line:
    init.xavier_normal_(m.weight.data, gain=gain)
  File "…/python3.6/site-packages/torch/nn/init.py", line 282, in xavier_normal_
    return _no_grad_normal_(tensor, 0., std)
  File "…/python3.6/site-packages/torch/nn/init.py", line 19, in _no_grad_normal_
    return tensor.normal_(mean, std)
RuntimeError: CUDA error: no kernel image is available for execution on the device.
Is it possible that the old GPU, despite not being used, is still causing this problem?
Thank you.