Machine restarted and now PyTorch gives CUDA error

AndrewUlmer · June 10, 2022, 5:34pm

Hi, our machine restarted and PyTorch can no longer use CUDA, even though it worked before the restart. When I call torch.cuda.device_count(), I get the following warning:

/home/username/miniconda3/envs/tvae/lib/python3.8/site-packages/torch/cuda/__init__.py:80: UserWarning: CUDA initialization: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero. (Triggered internally at /home/conda/feedstock_root/build_artifacts/pytorch-recipe_1635023442742/work/c10/cuda/CUDAFunctions.cpp:112.)
  return torch._C._cuda_getDeviceCount() > 0

Below is my PyTorch version, as well as the output of nvidia-smi and nvcc -V. Please let me know if you need any additional information.

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     10225      G   /usr/lib/xorg/Xorg                  4MiB |
|    0   N/A  N/A     10579      G   /usr/bin/gnome-shell                0MiB |
|    1   N/A  N/A     10225      G   /usr/lib/xorg/Xorg                101MiB |
|    1   N/A  N/A     10579      G   /usr/bin/gnome-shell               28MiB |
+-----------------------------------------------------------------------------+

Output of nvcc -V:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
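
For anyone hitting the same warning, the version details mentioned above can be gathered in one go. This is only a rough sketch: the helper name nvcc_release is made up for illustration, and the nvcc text is copied from this post rather than read from the machine. torch.version.cuda reports the CUDA version the PyTorch binary was built with, which can differ from the system toolkit that nvcc reports.

```python
import re


def nvcc_release(nvcc_output):
    """Return the 'release X.Y' version found in `nvcc -V` output, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


# nvcc output as quoted in this post (not queried from the machine).
NVCC_OUTPUT = (
    "nvcc: NVIDIA (R) Cuda compiler driver\n"
    "Copyright (c) 2005-2019 NVIDIA Corporation\n"
    "Built on Sun_Jul_28_19:07:16_PDT_2019\n"
    "Cuda compilation tools, release 10.1, V10.1.243\n"
)

print("System toolkit (nvcc):", nvcc_release(NVCC_OUTPUT))

# PyTorch may not be installed where this runs, so the import is guarded.
try:
    import torch
except ImportError:
    torch = None

if torch is not None:
    print("PyTorch version:", torch.__version__)
    # None here would mean a CPU-only build.
    print("CUDA version PyTorch was built with:", torch.version.cuda)
    print("CUDA available:", torch.cuda.is_available())
    print("Device count:", torch.cuda.device_count())
```

If the two CUDA versions differ, that is not necessarily the cause of the warning, since conda builds of PyTorch usually ship their own CUDA runtime, but it is useful context when asking for help.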

AlphaBetaGamma96 · June 10, 2022, 7:14pm

Do you use a conda environment? If so, you might just need to re-activate it.
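
A quick way to see which environment is currently active, as a minimal sketch: conda sets the CONDA_DEFAULT_ENV and CONDA_PREFIX variables on activation, and the helper name active_conda_env is just for illustration. Going by the path in the warning, the environment in question appears to be called tvae.

```python
import os
import sys


def active_conda_env(environ=None):
    """Return (env name, env prefix) as set by `conda activate`, or Nones."""
    if environ is None:
        environ = os.environ
    return environ.get("CONDA_DEFAULT_ENV"), environ.get("CONDA_PREFIX")


name, prefix = active_conda_env()
print("Active conda env:", name)
print("Env prefix:", prefix)
# The interpreter path shows which Python (and so which torch) is in use.
print("Python interpreter:", sys.executable)
```

If the name printed is not the environment PyTorch was installed into, or the interpreter path points outside it, re-activating the environment is the first thing to try.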

Also, do you know which CUDA version you were using before the restart? nvcc reports 10.1, but if you were using, say, CUDA 11.x, you might need to set your PATH and LD_LIBRARY_PATH variables again so that PyTorch sees the right CUDA install.
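
To see what those variables currently point at, something like the following could help. It is a sketch only: the helper cuda_entries is a made-up name, and it simply keeps path entries that contain the string "cuda", which assumes the toolkit lives in a directory named that way.

```python
import os


def cuda_entries(value):
    """Split a PATH-style string and keep the entries that mention 'cuda'."""
    return [p for p in value.split(os.pathsep) if p and "cuda" in p.lower()]


for var in ("PATH", "LD_LIBRARY_PATH"):
    entries = cuda_entries(os.environ.get(var, ""))
    print(var, "->", entries if entries else "no CUDA entries found")

# The warning also mentions CUDA_VISIBLE_DEVICES, so show it as well.
print("CUDA_VISIBLE_DEVICES ->", os.environ.get("CUDA_VISIBLE_DEVICES"))
```

If the entries point at a 10.1 toolkit while you expected 11.x, or nothing shows up at all, the variables were probably set in a shell session that did not survive the restart.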

AndrewUlmer · June 15, 2022, 7:05pm

I actually restarted the machine again and the issue resolved itself. Thanks for the help, though!