Environment:
- Remote Linux with kernel version 5.8.0. I am not a superuser.
- Python 3.8.6
- CUDA Version: 11.1
- GPU is RTX 3090 with driver version 455.23.05
- CPU: Intel Core i9-10900K
- PyTorch version: 1.8.0+cu111
- System-imposed RAM quota: 4 GB
- System-imposed number of threads: 512198
- System-imposed RLIMIT_NPROC value: 300 (these limits can be checked from Python as sketched below)
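For reference, I believe these limits can be confirmed from inside Python with the standard resource module (a minimal sketch; treating the 4 GB quota as an address-space limit, RLIMIT_AS, is my assumption about how it is enforced):

    import resource

    # Print (soft, hard) limits; -1 means unlimited.
    # Assumption: the 4 GB RAM quota is enforced as an address-space limit.
    print(resource.getrlimit(resource.RLIMIT_AS))     # address space, in bytes
    print(resource.getrlimit(resource.RLIMIT_NPROC))  # max processes/threads per user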
After I run the following code (immediately after entering the python3 command line, so nothing else ran before):
    import os
    os.environ['OPENBLAS_NUM_THREADS'] = '2'
    import torch
    torch.cuda.is_available()
    torch.cuda.device_count()
torch.cuda.device_count() returns 0 and torch.cuda.is_available() returns False, with this additional warning:
    /usr/local/lib/python3.8/dist-packages/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 2: out of memory (Triggered internally at /pytorch/c10/cuda/CUDAFunctions.cpp:109.)
      return torch._C._cuda_getDeviceCount() > 0
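In case it narrows things down, my understanding is that the CUDA driver API can be queried directly with ctypes, bypassing PyTorch entirely, to see whether the same error 2 comes from the driver itself (a sketch; I am assuming libcuda.so.1 is on the loader path):

    import ctypes

    # Call the CUDA driver API directly, bypassing PyTorch.
    # Return code 0 is CUDA_SUCCESS; 2 is CUDA_ERROR_OUT_OF_MEMORY,
    # the same error code PyTorch reports above.
    cuda = ctypes.CDLL("libcuda.so.1")
    print("cuInit:", cuda.cuInit(0))
    count = ctypes.c_int()
    print("cuDeviceGetCount:", cuda.cuDeviceGetCount(ctypes.byref(count)),
          "count =", count.value)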
But I can run nvidia-smi and nvcc successfully:
    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2020 NVIDIA Corporation
    Built on Mon_Oct_12_20:09:46_PDT_2020
    Cuda compilation tools, release 11.1, V11.1.105
    Build cuda_11.1.TC455_06.29190527_0
which means the GPU hardware and CUDA are installed. Why can PyTorch not see the GPU? Is it possible that PyTorch comes in GPU and non-GPU builds, and the sysadmin happened to install a non-GPU build? If so, how can I tell whether the installed PyTorch is the GPU or the non-GPU version? Are there any CUDA version requirements, such that the installed CUDA 11.1 does not get along with PyTorch 1.8.0? If so, what version of PyTorch does CUDA 11.1 work with? Thank you for your help.
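For the GPU-versus-non-GPU question specifically, my understanding (not verified on this exact wheel) is that a CUDA build can be told apart from a CPU-only build like this:

    import torch

    # A CUDA wheel carries a "+cu..." suffix and reports the CUDA version it
    # was built against; a CPU-only wheel ends in "+cpu" and reports None.
    print(torch.__version__)                # e.g. "1.8.0+cu111"
    print(torch.version.cuda)               # e.g. "11.1", or None for CPU-only builds
    print(torch.backends.cudnn.version())   # bundled cuDNN version, or None

Since my version string already ends in +cu111, I suspect the build itself is a CUDA one, but I would like to confirm that this check is conclusive.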