[SOLVED] Error when initializing GPU

stealthman13 · May 11, 2018, 8:59pm

Whenever I try to initialize my GPU in PyTorch, I receive the following error:

>>> torch.cuda.is_available()
True
>>> torch.cuda.init()
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1524584710464/work/aten/src/THC/THCTensorRandom.cu line=25 error=30 : unknown error
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/local.jmatthews/anaconda3/lib/python3.6/site-packages/torch/cuda/__init__.py", line 143, in init
    _lazy_init()
  File "/home/local.jmatthews/anaconda3/lib/python3.6/site-packages/torch/cuda/__init__.py", line 161, in _lazy_init
    torch._C._cuda_init()
RuntimeError: cuda runtime error (30) : unknown error at /opt/conda/conda-bld/pytorch_1524584710464/work/aten/src/THC/THCTensorRandom.cu:25

I’ve restarted my server with no luck.

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Tue_Jan_10_13:22:03_CST_2017
Cuda compilation tools, release 8.0, V8.0.61

I’m running CentOS.

Any idea what’s going on?

GrumpyZhou · May 12, 2018, 9:40pm

Hi,
I got a very similar error, and the cause turned out to be that I was using the wrong GPU id. An easy way to check whether this is your problem is to create a tensor on that GPU, e.g., a = torch.tensor([1., 2.], device=torch.device('cuda:0')), replacing 0 with the id you are trying to use. I solved it by using the correct GPU id.
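
The check described above can be wrapped in a small helper. This is only a minimal sketch: the function name check_gpu_id is made up for illustration, and it assumes a PyTorch version that accepts the device= argument to torch.tensor (as in the one-liner above). It first compares the id against torch.cuda.device_count(), then tries the tensor allocation and reports any RuntimeError instead of crashing.

```python
# Guarded import so the script still reports something useful
# on a machine where PyTorch itself is missing.
try:
    import torch
except ImportError:
    torch = None


def check_gpu_id(gpu_id):
    """Hypothetical helper: return (ok, message) for a CUDA device index."""
    if torch is None:
        return False, "PyTorch is not installed"
    if not torch.cuda.is_available():
        return False, "torch.cuda.is_available() returned False"

    count = torch.cuda.device_count()
    if not 0 <= gpu_id < count:
        return False, "GPU id {} is out of range ({} device(s) visible)".format(gpu_id, count)

    try:
        # Same test as in the post: allocate a tiny tensor on that device.
        torch.tensor([1., 2.], device=torch.device("cuda:{}".format(gpu_id)))
    except RuntimeError as exc:
        return False, "allocation on cuda:{} failed: {}".format(gpu_id, exc)
    return True, "cuda:{} is usable".format(gpu_id)


if __name__ == "__main__":
    ok, message = check_gpu_id(0)
    print("OK" if ok else "FAILED", "-", message)
```

Note that is_available() returning True is not enough on its own: in the original post it returned True and initialization still failed, which is why the sketch also attempts a real allocation.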

stealthman13 · May 14, 2018, 3:07pm

It turns out that someone had installed CUDA 9.1 on the system I was working on, while I had the PyTorch build for CUDA 8.0 installed. I installed the PyTorch build for CUDA 9.1 and it worked.
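
For anyone hitting the same mismatch, a rough way to compare versions is sketched below. The helper names parse_nvcc_release and report_cuda_versions are invented for this example. It reads torch.version.cuda (the CUDA version the installed PyTorch binary was built for) and parses the "release X.Y" string from nvcc --version. Treat the result as a hint only: in this thread nvcc still reported 8.0 even though CUDA 9.1 had been installed, so the nvcc on your PATH may not reflect what is actually in use on the system.

```python
import re
import shutil
import subprocess

try:
    import torch
except ImportError:
    torch = None


def parse_nvcc_release(nvcc_output):
    """Extract 'X.Y' from a line like 'Cuda compilation tools, release 8.0, V8.0.61'."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def report_cuda_versions():
    """Hypothetical helper: return (CUDA version PyTorch was built for, nvcc release)."""
    torch_cuda = torch.version.cuda if torch is not None else None

    nvcc_cuda = None
    if shutil.which("nvcc"):  # only call nvcc if it is on PATH
        result = subprocess.run(
            ["nvcc", "--version"],
            stdout=subprocess.PIPE,
            universal_newlines=True,
        )
        nvcc_cuda = parse_nvcc_release(result.stdout)
    return torch_cuda, nvcc_cuda


if __name__ == "__main__":
    torch_cuda, nvcc_cuda = report_cuda_versions()
    print("PyTorch built for CUDA:", torch_cuda)
    print("nvcc reports CUDA:     ", nvcc_cuda)
    if torch_cuda and nvcc_cuda and not torch_cuda.startswith(nvcc_cuda):
        print("Versions differ; a matching PyTorch build may be needed.")
```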