No CUDA GPUs are available

NullPointer · July 7, 2021, 1:15am

Hi, I’m trying to run a project within a conda env. I have an RTX 3070 Ti installed in my machine, and it seems that the CUDA initialization function is causing issues in the program.

Error:
  File "sTrain.py", line 37, in <module>
    torch.cuda.set_device(gpuid)
  File "/home/user/miniconda3/envs/deeplearningenv/lib/python3.6/site-packages/torch/cuda/__init__.py", line 263, in set_device
    torch._C._cuda_setDevice(device)
  File "/home/user/miniconda3/envs/deeplearningenv/lib/python3.6/site-packages/torch/cuda/__init__.py", line 172, in _lazy_init
    torch._C._cuda_init()
RuntimeError: No CUDA GPUs are available

Any guidance would be very helpful. I tried uninstalling cudatoolkit, pytorch, and torchvision and reinstalling with conda install pytorch torchvision cudatoolkit=10.1, but I get the same error. Also, the GPU appears in the device manager menu.

I’m not sure if this will aid in finding a solution, but torch.cuda.is_available() returns False.

Thank you in advance.

ptrblck · July 7, 2021, 6:12am

The error points to a missing NVIDIA driver, so you might want to reinstall it.
While the PyTorch pip wheels and conda binaries ship with the CUDA runtime, an NVIDIA driver would still be needed to be able to execute workloads on the GPU.
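
To check both halves of that statement separately (is a driver present, and was the installed PyTorch build compiled with CUDA), a small diagnostic along these lines might help. This is only a sketch: the helper name cuda_report is made up for illustration, and it assumes that nvidia-smi is on the PATH whenever the driver is installed, which is usual but not guaranteed.

```python
import shutil
import subprocess


def cuda_report():
    """Collect basic facts about the NVIDIA driver and the PyTorch build.

    Hypothetical helper for illustration; not part of PyTorch.
    """
    report = {"nvidia_smi_found": False, "gpus": [], "torch_installed": False}

    # nvidia-smi ships with the NVIDIA driver, so a missing binary is a
    # strong hint (not proof) that the driver is not installed correctly.
    smi = shutil.which("nvidia-smi")
    if smi is not None:
        report["nvidia_smi_found"] = True
        try:
            result = subprocess.run(
                [smi, "--query-gpu=name,driver_version", "--format=csv,noheader"],
                stdout=subprocess.PIPE,
                stderr=subprocess.PIPE,
                universal_newlines=True,
                timeout=30,
            )
            if result.returncode == 0:
                report["gpus"] = result.stdout.strip().splitlines()
        except (OSError, subprocess.SubprocessError):
            pass  # the binary exists but could not talk to the driver

    try:
        import torch
    except ImportError:
        return report

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    # None here means a CPU-only build, which can never see a GPU.
    report["built_with_cuda"] = torch.version.cuda
    report["cuda_available"] = torch.cuda.is_available()
    report["device_count"] = torch.cuda.device_count()
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If nvidia_smi_found is False or gpus is empty, the driver is the first thing to look at. If built_with_cuda is None, the environment contains a CPU-only PyTorch build and reinstalling the driver will not help.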

NullPointer · July 7, 2021, 12:23pm

I’ve been trying that but I’m not having any luck. My card says it’s on cuda 11.4. Is it possible to downgrade to 11.1?

NullPointer · July 7, 2021, 2:37pm

It seems the version is correct here, but when I run nvidia-smi I see the cuda version is 11.4

ptrblck · July 7, 2021, 8:25pm

It seems you’ve installed a new driver while an older CUDA toolkit was still installed.
Try to compile CUDA examples and execute them to make sure your setup is working fine.
If that’s not the case, uninstall the driver and CUDA toolkit and reinstall them.

SM2023 · May 22, 2023, 8:18am

Hi @ptrblck, sorry, I have run my code several times. Now I changed one parameter (the number of epochs) and wanted to rerun it, and it gave me the error. I called the code as shown below. Why does this happen suddenly? It worked before! My system is Linux.

CUDA_VISIBLE_DEVICES="1,2,3" python casesummary_resolution_GPT_Neo_GPU_V5-125M-Trainer_v22.py

In a Jupyter notebook it runs with device=1, but if I call it as above it gives me the following error:

torch._C._cuda_init()
RuntimeError: No CUDA GPUs are available

ptrblck · May 22, 2023, 3:59pm

If your system only has a single valid GPU, you are masking it via the CUDA_VISIBLE_DEVICES. If that’s not the case, check if any drivers etc. were updated which might have broken your setup.
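
To illustrate the masking: CUDA reads the indices in CUDA_VISIBLE_DEVICES from left to right and stops at the first one that does not exist, so only the devices listed before it stay visible. The function below is a simplified simulation of that rule, not the CUDA implementation. The name visible_devices is made up for illustration, and the sketch handles integer indices only (no GPU UUIDs, no duplicate entries).

```python
def visible_devices(mask, physical_count):
    """Simulate which physical GPU indices remain visible to a process.

    mask: value of CUDA_VISIBLE_DEVICES, or None if the variable is unset.
    physical_count: number of GPUs actually present in the machine.
    Hypothetical helper; integer indices only.
    """
    if mask is None:
        # Variable not set: every physical device is visible.
        return list(range(physical_count))

    visible = []
    for entry in mask.split(","):
        entry = entry.strip()
        # The first invalid index hides itself and everything after it.
        if not entry.isdigit() or int(entry) >= physical_count:
            break
        visible.append(int(entry))
    return visible


if __name__ == "__main__":
    print(visible_devices("1,2,3", 1))  # [] : the only GPU (index 0) is masked
    print(visible_devices("1,2,3", 4))  # [1, 2, 3]
    print(visible_devices(None, 2))     # [0, 1]
```

With a single GPU, the mask "1,2,3" leaves nothing visible, which matches the "No CUDA GPUs are available" error. Inside the process the visible devices are renumbered starting from 0, so the physical GPU 1 would be addressed as cuda:0.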

SM2023 · May 23, 2023, 1:39am

@ptrblck Many thanks for your help and reply. Sorry, do you have any idea about my post here?

Help CUDA error: out of memory - PyTorch Live - PyTorch Forums

The model can’t be fine-tuned properly using DDP, and it generates nothing for me. I would appreciate it if you could have a look. Many thanks.