PyTorch 1.6 does not detect CUDA (GPU) on Arch Linux

Hi,

PyTorch 1.6 does not seem to detect CUDA.

I installed it with the following command:

conda install pytorch torchvision cudatoolkit=10.2 -c pytorch

Code example

When I test whether CUDA is available, I get:

>>> import torch
>>> torch.cuda.is_available()
False

System Info

python -m torch.utils.collect_env

returns

Collecting environment information...
PyTorch version: 1.6.0
Is debug build: No
CUDA used to build PyTorch: 10.2

OS: Arch Linux
GCC version: (GCC) 10.2.0
CMake version: Could not collect

Python version: 3.8
Is CUDA available: No
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: GeForce RTX 2080 Ti
Nvidia driver version: 455.28
cuDNN version: Probably one of the following:
/usr/lib/libcudnn.so.8.0.4
/usr/lib/libcudnn_adv_infer.so.8.0.4
/usr/lib/libcudnn_adv_train.so.8.0.4
/usr/lib/libcudnn_cnn_infer.so.8.0.4
/usr/lib/libcudnn_cnn_train.so.8.0.4
/usr/lib/libcudnn_ops_infer.so.8.0.4
/usr/lib/libcudnn_ops_train.so.8.0.4

Versions of relevant libraries:
[pip3] numpy==1.18.5
[pip3] torch==1.6.0
[pip3] torchtext==0.7.0
[pip3] torchvision==0.7.0
[conda] blas                      1.0                         mkl  
[conda] cudatoolkit               10.2.89              hfd86e86_1  
[conda] mkl                       2020.2                      256  
[conda] mkl-service               2.3.0            py38he904b0f_0  
[conda] mkl_fft                   1.2.0            py38h23d657b_0  
[conda] mkl_random                1.1.1            py38h0573a6f_0  
[conda] numpy                     1.19.2           py38h54aff64_0  
[conda] numpy-base                1.19.2           py38hfa32c7d_0  
[conda] pytorch                   1.6.0           py3.8_cuda10.2.89_cudnn7.6.5_0    pytorch
[conda] torchtext                 0.7.0                    pypi_0    pypi
[conda] torchvision               0.7.0                py38_cu102    pytorch

My specifications

  • OS: Arch Linux
  • PyTorch version: 1.6
  • Python version: 3.8
  • CUDA/cuDNN version: 11.1 and 10.2 (tested with both)
  • GPU models and configuration: GeForce RTX 2080 Ti

Thank you in advance for your help.

Check whether CUDA is properly installed with the nvidia-smi command.

Can you also try installing it with pip: pip install torch torchvision, as taken from this link?

@appleparan Of course, I had already checked that CUDA is properly installed; here is the output:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.28       Driver Version: 455.28       CUDA Version: 11.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  Off  | 00000000:03:00.0  On |                  N/A |
|  0%   43C    P8    12W / 260W |    185MiB / 11016MiB |      1%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

...

@Toby Same problem when PyTorch is installed with pip install torch torchvision.

Since you are an Arch user, are you using Linux kernel 5.9? Linux 5.9 has known problems with CUDA. If so, try downgrading to Linux 5.8.
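
If you are not sure which kernel you are currently running, a quick way to check is:

uname -r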

Yes, I am using Linux kernel 5.9. Is there any workaround?

As far as I know, there is no workaround other than downgrading to Linux kernel 5.8 (or lower) and replacing the nvidia driver package (nvidia) with nvidia-dkms.


Thank you. Indeed, I downgraded to Linux kernel 5.8.5 from the Arch Linux Archive, removed nvidia, then installed linux-headers-5.8.5 first and nvidia-dkms second. It is worth noting that linux-headers-5.8.5 must be installed before nvidia-dkms, otherwise the 5.8.5 kernel headers will be missing when nvidia-dkms builds the module. A rough sketch of the commands is below.
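
For anyone else hitting this, the steps were roughly the following. The package file names and URLs here are illustrative; pick the exact 5.8.5 builds for your architecture from archive.archlinux.org:

# remove the stock driver that was built against the running kernel
sudo pacman -R nvidia
# downgrade kernel and headers from the Arch Linux Archive
# (file names are illustrative; use the actual 5.8.5 packages from the archive)
sudo pacman -U https://archive.archlinux.org/packages/l/linux/linux-5.8.5.arch1-1-x86_64.pkg.tar.zst
sudo pacman -U https://archive.archlinux.org/packages/l/linux-headers/linux-headers-5.8.5.arch1-1-x86_64.pkg.tar.zst
# install nvidia-dkms only after the headers, so the module is built against 5.8.5
sudo pacman -S nvidia-dkms
sudo reboot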

Now it works and PyTorch detects CUDA.
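
For completeness, a quick sanity check after rebooting:

python -c "import torch; print(torch.cuda.is_available(), torch.cuda.get_device_name(0))"
# expected: True GeForce RTX 2080 Ti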

Great that you fixed it!