I need to use CUDA 11.8 with PyTorch, so I installed it with the command from the PyTorch homepage:
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
However, when I run:
import torch
torch.version.cuda
the output is 11.2. I also have the following environment variable set:
LD_LIBRARY_PATH=/usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/local/cuda-11.8/lib64/
I have no idea where the 11.2 is coming from. Listing the installed packages shows that the pytorch, pytorch-gpu, and torchvision builds come from conda-forge and contain cuda112 in their build strings:
(base) sagemaker-user@default:~$ conda list -n base | grep torch
pytorch 2.0.0 cuda112py310he33e0d6_200 conda-forge
pytorch-cuda 11.8 h7e8668a_5 pytorch
pytorch-gpu 2.0.0 cuda112py310h9871d0b_200 conda-forge
pytorch-lightning 2.0.9 pyhd8ed1ab_0 conda-forge
pytorch-metric-learning 1.7.3 pyhd8ed1ab_0 conda-forge
pytorch-mutex 1.0 cuda pytorch
torchaudio 2.0.0 py310_cu118 pytorch
torchvision 0.15.2 cuda112py310h0801bf5_1 conda-forge
(base) sagemaker-user@default:~$ conda list -n base | grep cuda-toolkit
cuda-toolkit 11.8.0 0 nvidia
(base) sagemaker-user@default:~$ echo $CUDA_HOME
(base) sagemaker-user@default:~$ echo $CUDA_PATH
(base) sagemaker-user@default:~$
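To make the mismatch in the listing easier to see, here is a minimal sketch that reads the CUDA variant out of each build string. It assumes the build strings follow the `cuda112` / `cu118` / `cpu_...` naming seen above, which is a convention rather than a guarantee; `cuda_variant` and `summarize` are hypothetical helper names, not part of conda or PyTorch.

```python
import re

# Heuristic: build strings in the listing embed the CUDA version as "cuda112" or "cu118".
_CUDA_TAG = re.compile(r"(?:cuda|cu)(\d{2,3})")


def cuda_variant(build):
    """Return e.g. '11.2', 'cpu', or 'unknown' for a conda build string."""
    match = _CUDA_TAG.search(build)
    if match:
        digits = match.group(1)
        # Last digit is the minor version: "112" -> "11.2", "118" -> "11.8".
        return f"{digits[:-1]}.{digits[-1]}"
    if build.startswith("cpu"):
        return "cpu"
    return "unknown"


def summarize(conda_list_output):
    """Map package name -> (channel, variant) for 'name version build channel' rows."""
    summary = {}
    for line in conda_list_output.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip headers, prompts, and rows without a channel column
        name, _version, build, channel = fields
        summary[name] = (channel, cuda_variant(build))
    return summary


# Rows copied from the `conda list` output above.
listing = """\
pytorch 2.0.0 cuda112py310he33e0d6_200 conda-forge
pytorch-cuda 11.8 h7e8668a_5 pytorch
torchaudio 2.0.0 py310_cu118 pytorch
torchvision 0.15.2 cuda112py310h0801bf5_1 conda-forge
"""

for name, (channel, variant) in summarize(listing).items():
    print(f"{name:14s} {channel:12s} {variant}")
```

Read this way, only torchaudio is a cu118 build from the pytorch channel; pytorch and torchvision are CUDA 11.2 builds from conda-forge, which matches what torch.version.cuda reports.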
I would appreciate it if someone could help me fix this. If you have run into this before, could you explain why it happens? Thanks!
Edit:
I tried uninstalling with pip and conda, then reinstalling:
pip uninstall torch
pip uninstall torchvision
conda remove pytorch-gpu
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
conda list | grep torch
This is what I get now; pytorch and torchvision are now CPU builds from conda-forge:
sagemaker-user@default:~$ conda list | grep torch
pytorch 2.0.0 cpu_mkl_py310h402c8e3_101 conda-forge
pytorch-cuda 11.8 h7e8668a_5 pytorch
pytorch-lightning 2.0.9 pyhd8ed1ab_0 conda-forge
pytorch-metric-learning 1.7.3 pyhd8ed1ab_0 conda-forge
pytorch-mutex 1.0 cuda pytorch
torchaudio 2.0.0 py310_cu118 pytorch
torchmetrics 1.0.3 pyhd8ed1ab_0 conda-forge
torchvision 0.15.2 cpu_py310hb9e6163_1 conda-forge
In Python, CUDA is no longer detected at all:
import torch
print(torch.cuda.is_available()) # False
torch.version.cuda # Prints nothing
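For anyone trying to reproduce this, a small sketch that collects the same checks in one place, plus the path of the torch package that Python actually imports. The helper name `torch_build_info` is made up for illustration; it only uses `torch.__version__`, `torch.__file__`, `torch.version.cuda`, and `torch.cuda.is_available()`.

```python
def torch_build_info():
    """Collect facts about the torch build that this interpreter imports."""
    try:
        import torch
    except ImportError:
        return {"installed": False}
    return {
        "installed": True,
        "version": torch.__version__,
        "location": torch.__file__,  # shows which environment torch is loaded from
        "built_for_cuda": torch.version.cuda,  # None on CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }


for key, value in torch_build_info().items():
    print(f"{key}: {value}")
```

On a CPU-only build, `built_for_cuda` is None, which is why the bare `torch.version.cuda` expression shows nothing in the interpreter.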