nvidia-smi works and torch.backends.cudnn.enabled returns True, but torch.cuda.is_available() returns False. Rebooting doesn't help.
I don't know what's wrong.
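For anyone hitting the same symptom, the relevant flags and versions can be collected in one go with a small script like this (a sketch; the function name is just for illustration):

```python
def cuda_diagnostics():
    """Gather the torch/CUDA facts relevant to this symptom into one string."""
    lines = []
    try:
        import torch
        lines.append("torch %s" % torch.__version__)
        lines.append("built against CUDA: %s" % torch.version.cuda)
        lines.append("cudnn enabled: %s" % torch.backends.cudnn.enabled)
        lines.append("cuda available: %s" % torch.cuda.is_available())
    except ImportError:
        lines.append("torch is not installed")
    return "\n".join(lines)

print(cuda_diagnostics())
```

If `built against CUDA` prints None, the installed wheel is CPU-only and the problem is the install, not the driver.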
CUDA path is as follows:
How did you install torch?
Hi, with pip install torch.
I also tried installing torch from source and got the same result.
When installing from source, the build output shows it is not using cuDNN, CUDA, or NCCL.
From the info in the setup, you might want to set
CUDA_HOME=/path/to/your/cuda/install and similar for cuDNN. NCCL will be compiled from source if CUDA is detected properly, so you don't need a local install (unless you have one and want to use it).
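For example, a source build might be set up like this. The /usr/local/cuda path is a placeholder, and CUDNN_LIB_DIR / CUDNN_INCLUDE_DIR are the variables older PyTorch source builds consulted, so double-check the variable names against your checkout's setup scripts:

```shell
# Placeholder paths -- point these at your actual CUDA/cuDNN install.
export CUDA_HOME=/usr/local/cuda
export CUDNN_LIB_DIR=$CUDA_HOME/lib64
export CUDNN_INCLUDE_DIR=$CUDA_HOME/include

# Then rebuild so setup.py re-detects CUDA and cuDNN:
#   python setup.py clean && python setup.py install
echo "CUDA_HOME set to $CUDA_HOME"
```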
I'm using a Linux container, and the CUDA root is in a non-standard location.
I've now added CUDA_HOME to my environment variables and built from source in my conda environment, and it compiles the CUDA parts correctly. But torch.cuda.is_available() still returns False.
-- Building with NumPy bindings
-- Detected cuDNN at /home/ubuntu/cuda/lib64, /home/ubuntu/cuda/include
-- Detected CUDA at /home/ubuntu/cuda
-- Building NCCL library
-- Building with distributed package
-- Not using NNPACK
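One thing worth ruling out at this point is whether the dynamic loader can see the same CUDA libraries at runtime that the build detected. A sketch, reusing the /home/ubuntu/cuda path from the build log above (the helper name is made up):

```python
import os

def on_ld_library_path(lib_dir):
    """Return True if lib_dir appears as an entry in LD_LIBRARY_PATH."""
    entries = os.environ.get("LD_LIBRARY_PATH", "").split(os.pathsep)
    return lib_dir in entries

# The lib64 directory the build detected should also be visible at runtime.
print(on_ld_library_path("/home/ubuntu/cuda/lib64"))
```

If this prints False, exporting `LD_LIBRARY_PATH=/home/ubuntu/cuda/lib64:$LD_LIBRARY_PATH` before launching Python is worth a try.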
I tried running the CUDA samples, and they returned the following error. Neither a warm nor a cold reboot helps.
CUDA Device Query (Runtime API) version (CUDART static linking)
cudaGetDeviceCount returned 30
-> unknown error
Result = FAIL
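The same device count can also be queried from the driver directly via the CUDA driver API, bypassing the runtime, which helps distinguish a driver problem from a toolkit problem. A sketch (Linux only; returns None for the status if libcuda can't be loaded at all):

```python
import ctypes

def driver_device_count():
    """Ask the NVIDIA driver (libcuda) how many devices it sees.

    Returns (status, count). A status of 0 is CUDA_SUCCESS; None means
    the driver library could not be loaded at all.
    """
    try:
        libcuda = ctypes.CDLL("libcuda.so.1")
    except OSError:
        return None, 0
    status = libcuda.cuInit(0)
    if status != 0:
        return status, 0
    count = ctypes.c_int(0)
    status = libcuda.cuDeviceGetCount(ctypes.byref(count))
    return status, count.value

print(driver_device_count())
```

If cuInit itself fails here, the driver install is broken regardless of what the toolkit or PyTorch does.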
I also found that after pip install torchvision, nvidia-smi stops working.
If the CUDA samples don't run, then the problem is with your CUDA install.
In that case I would advise cleanly removing every CUDA install from the system and reinstalling from scratch, together with the NVIDIA drivers that match that toolkit version. Then make sure the CUDA samples work properly. Once they do, you can install PyTorch.
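After reinstalling, a quick way to confirm the driver and toolkit are a compatible pair is to print both versions side by side. A sketch; each command is skipped if the tool isn't installed:

```shell
check_versions() {
    # Driver version reported by the kernel module:
    command -v nvidia-smi >/dev/null 2>&1 && \
        nvidia-smi --query-gpu=driver_version --format=csv,noheader
    # Toolkit version nvcc was built for:
    command -v nvcc >/dev/null 2>&1 && nvcc --version | grep -i release
    echo "version check done"
}
check_versions
```

The driver must support at least the CUDA version the toolkit (and the PyTorch build) targets.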
I solved it after several attempts at reinstalling PyTorch.