I’m currently unable to utilize my GPU as I get this error any time I try to use CUDA functions:
UserWarning: CUDA initialization: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero. (Triggered internally at /pytorch/c10/cuda/CUDAFunctions.cpp:109.)
The odd thing is that I don’t recall upgrading my drivers or changing anything related to PyTorch in the last few months, and it used to work fine.
I’ve tried reinstalling PyTorch multiple times (through conda) and testing different CUDA toolkit versions, but it doesn’t seem to help. I’ve also tried rebooting.
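A minimal check along the following lines should be enough to trigger the warning. It is only a sketch: `cuda_report` is an illustrative helper name, not part of PyTorch, and the `ImportError` fallback is there only so the script still runs where PyTorch is not installed.

```python
import os


def cuda_report():
    """Collect the few facts relevant to the CUDA initialization warning."""
    # The warning text mentions this variable, so record its current value.
    report = {"CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES")}
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this interpreter; nothing more to check.
        report["torch"] = None
        return report
    report["torch"] = torch.__version__
    # CUDA version the installed PyTorch binary was built against (None for CPU-only builds).
    report["built_with_cuda"] = torch.version.cuda
    # These two calls initialize CUDA lazily and are where the UserWarning shows up.
    report["cuda_available"] = torch.cuda.is_available()
    report["device_count"] = torch.cuda.device_count()
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

Running it in a fresh interpreter (so that `CUDA_VISIBLE_DEVICES` is not changed after startup) rules out the cause suggested in the warning text itself.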
Here’s the collect_env output:
Collecting environment information...
/home/nick/miniconda3/lib/python3.8/site-packages/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero. (Triggered internally at /pytorch/c10/cuda/CUDAFunctions.cpp:109.)
return torch._C._cuda_getDeviceCount() > 0
PyTorch version: 1.8.0
Is debug build: False
CUDA used to build PyTorch: 10.2
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.2 LTS (x86_64)
GCC version: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Clang version: Could not collect
CMake version: version 3.16.3
Libc version: glibc-2.31
Python version: 3.8.10 | packaged by conda-forge | (default, May 11 2021, 07:01:05) [GCC 9.3.0] (64-bit runtime)
Python platform: Linux-5.11.0-25-generic-x86_64-with-glibc2.10
Is CUDA available: False
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: GeForce GTX 1070 Ti
Nvidia driver version: 450.119.03
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Versions of relevant libraries:
[pip3] numpy==1.21.0
[pip3] numpy-quaternion==2021.3.17.16.51.43
[pip3] pytorch-lightning==1.3.1
[pip3] pytorch-lightning-bolts==0.3.2
[pip3] torch==1.8.0
[pip3] torchmetrics==0.3.2
[pip3] torchvision==0.9.0
[conda] cudatoolkit 11.0.221 h6bb024c_0 nvidia
[conda] numpy 1.20.1 pypi_0 pypi
[conda] numpy-quaternion 2021.3.17.16.51.43 pypi_0 pypi
[conda] pytorch-lightning 1.3.1 pypi_0 pypi
[conda] pytorch-lightning-bolts 0.3.2 pypi_0 pypi
[conda] torch 1.8.0 pypi_0 pypi
[conda] torchmetrics 0.3.2 pypi_0 pypi
[conda] torchvision 0.9.0 pypi_0 pypi
Also, nvidia-smi and nvtop seem to be working fine. Here’s the nvidia-smi output:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.119.03   Driver Version: 450.119.03   CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 107...  Off  | 00000000:08:00.0  On |                  N/A |
|  5%   49C    P0    38W / 180W |    524MiB /  8113MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1135      G   /usr/lib/xorg/Xorg                 83MiB |
|    0   N/A  N/A      1832      G   /usr/lib/xorg/Xorg                212MiB |
|    0   N/A  N/A      1970      G   /usr/bin/gnome-shell              170MiB |
|    0   N/A  N/A      2156      G   ...wnloads/Telegram/Telegram        5MiB |
|    0   N/A  N/A      2458      G   .../debug.log --shared-files       32MiB |
+-----------------------------------------------------------------------------+
Thanks in advance for any help.