PyTorch 2.1 compatibility with CUDA 12.3

SuperSonnix71 · November 27, 2023, 6:02pm

llama fails to run on the GPU.
I traced it to torch: torch is using CUDA 12.1. I tried several approaches, including removing 12.1 to make it use 12.3 and downgrading the NVIDIA driver. No joy! All help is appreciated. Running on openSUSE Tumbleweed.
PyTorch Version: 2.1.1
CUDA Version: 12.1
CUDA Available: False

| NVIDIA-SMI 545.29.06 | Driver Version: 545.29.06 | CUDA Version: 12.3 |

nvcc: NVIDIA (R) Cuda compiler driver

Copyright (c) 2005-2023 NVIDIA Corporation
Built on Fri_Nov__3_17:16:49_PDT_2023
Cuda compilation tools, release 12.3, V12.3.103
Build cuda_12.3.r12.3/compiler.33492891_0

export PATH=/usr/local/cuda-12.3/bin:$PATH

export LD_LIBRARY_PATH=/usr/local/cuda-12.3/lib64:$LD_LIBRARY_PATH

python -c "import torch; print('PyTorch Version:', torch.__version__, '\nCUDA Version:', torch.version.cuda, '\nCUDA Available:', torch.cuda.is_available())"

PyTorch Version: 2.1.1

CUDA Version: 12.1
CUDA Available: False

llama_new_context_with_model: compute buffer total size = 278.43 MiB
AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 |
16:54:47.668 [INFO ] private_gpt.components.embedding.embedding_component - Initializing the embedding model in mode=local
16:54:48.561 [WARNING ] py.warnings - /home/anaconda3/envs/privategpt/lib/python3.11/site-packages/torch/cuda/__init__.py:611: UserWarning: Can't initialize NVML
warnings.warn("Can't initialize NVML")
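
One thing the output above already helps rule out: the PyTorch binaries ship with their own CUDA runtime (12.1 here), so the locally installed 12.3 toolkit and `nvcc` are not what `torch` uses; only the driver has to be new enough for the bundled runtime. A minimal sketch of that comparison, using the `nvidia-smi` header from this post. The helper names `parse_smi_cuda` and `driver_covers` are made up for illustration.

```python
import re


def parse_smi_cuda(header: str) -> tuple[int, int]:
    """Extract the 'CUDA Version: X.Y' field from an nvidia-smi header line."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", header)
    if match is None:
        raise ValueError("no CUDA version found in nvidia-smi header")
    return int(match.group(1)), int(match.group(2))


def driver_covers(header: str, torch_cuda: str) -> bool:
    """True if the driver's maximum CUDA version is >= the runtime bundled with torch."""
    major, minor = (int(part) for part in torch_cuda.split(".")[:2])
    return parse_smi_cuda(header) >= (major, minor)


header = "| NVIDIA-SMI 545.29.06 | Driver Version: 545.29.06 | CUDA Version: 12.3 |"
# Compare the driver's reported CUDA version (12.3) with torch.version.cuda (12.1).
print(driver_covers(header, "12.1"))
```

If the check passes, the gap between 12.1 and 12.3 is probably not the problem by itself, and the `Can't initialize NVML` warning points more toward the driver installation.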

ptrblck · November 27, 2023, 6:35pm

Check your NVIDIA driver and reinstall it if needed.

SuperSonnix71 · November 27, 2023, 7:10pm

Already did! I tried the following versions. No joy!
NVIDIA-Linux-x86_64-535.129.03.run
NVIDIA-Linux-x86_64-545.29.06.run

SuperSonnix71 · November 27, 2023, 8:30pm

The issue is now solved!

ptrblck · November 27, 2023, 8:44pm

Could you post your solution here?

banton · November 30, 2023, 11:59am

My friend, if you are prepared to receive help on a technical forum, please also be prepared to post the solution if you arrive at it yourself.

Aymen_ABID · January 23, 2024, 7:59pm

Do you have an idea how to install it for:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Cuda compilation tools, release 12.3, V12.3.107
Build cuda_12.3.r12.3/compiler.33567101_0

in order to resolve:
import torch
print(torch.cuda.is_available())
=>False

ptrblck · January 23, 2024, 8:01pm

Follow these instructions and it should work.

Aymen_ABID · January 24, 2024, 7:15am

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://download.pytorch.org/whl/cu121
ERROR: Could not find a version that satisfies the requirement torch (from versions: none)
ERROR: No matching distribution found for torch

Trying

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121 --no-cache-dir

?
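
`from versions: none` usually means pip found no wheel on that index that matches the running interpreter, so the cache flag is unlikely to change anything. A small sketch for checking what pip is matching against. `interpreter_summary` is a hypothetical helper, and which Python versions a given PyTorch release provides wheels for should be checked against the install matrix rather than taken from here.

```python
import platform
import struct
import sys


def interpreter_summary() -> dict:
    """Collect the interpreter facts that decide which wheels pip will accept."""
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "bits": struct.calcsize("P") * 8,  # pointer size in bits: 32 or 64
        "platform": sys.platform,          # "linux", "win32", "darwin", ...
        "executable": sys.executable,      # the Python that would receive the install
    }


for key, value in interpreter_summary().items():
    print(f"{key}: {value}")
```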

ptrblck · January 24, 2024, 1:26pm

You asked your initial question in this topic, which is specific to CUDA 12.3. If you really need this version for some reason, you would need to build PyTorch from source, as indicated in my previous answer.
Now you post a pip install command instead, which would install PyTorch binaries with CUDA dependencies.
Could you explain what you actually want?

Aymen_ABID · January 24, 2024, 2:33pm

I want to use the NVIDIA GPU via Jupyter…

After installing with
(!pip install torch torchvision torchaudio -f https://download.pytorch.org/whl/cu123/torch_stable.html)
or
(!pip3 install torch torchvision torchaudio -f https://download.pytorch.org/whl/torch_stable.html)

it seems fine, apart from these messages:

Defaulting to user installation because normal site-packages is not writeable …
Requirement already satisfied…

But unfortunately, the result is the same.

import torch

torch.cuda.is_available()

Out[1]:

False
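
One possibility worth ruling out here (an assumption, nothing in the output confirms it): `!pip` in a notebook can belong to a different Python than the one the kernel runs, and the `Defaulting to user installation` message shows the packages went to the user site. A sketch that builds the install command against the kernel's own interpreter. `kernel_pip_command` is a hypothetical helper, and the index URL is the cu121 one quoted earlier in this thread.

```python
import sys

CU121_INDEX = "https://download.pytorch.org/whl/cu121"  # URL quoted earlier in this thread


def kernel_pip_command(packages, index_url=CU121_INDEX):
    """Build a pip command that targets the interpreter running this kernel."""
    return [sys.executable, "-m", "pip", "install", *packages, "--index-url", index_url]


cmd = kernel_pip_command(["torch", "torchvision", "torchaudio"])
print(" ".join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

The kernel has to be restarted after installing, otherwise the previously imported build stays loaded.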

ptrblck · January 24, 2024, 3:19pm

Uninstall the current PyTorch binary and just run a supported command from the install matrix.

Aymen_ABID · January 24, 2024, 7:39pm

(base) C:\Users\aymen>conda uninstall pytorch
Collecting package metadata (repodata.json): done
Solving environment: failed

PackagesNotFoundError: The following packages are missing from the target environment:

pytorch

ptrblck · January 24, 2024, 8:06pm

How are you importing torch if nothing was installed?
In the end you should either uninstall all PyTorch binaries and install the current stable or nightly release (a simple pip install torch would be enough to install torch==2.1.2+cu121) or create a new and empty virtual environment and install PyTorch there.
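
To answer the first question directly, Python can report where `torch` would be imported from without importing it. A minimal sketch using only the standard library; `locate_module` is a hypothetical helper name.

```python
import importlib.util


def locate_module(name: str):
    """Return the file a top-level module would be loaded from, or None if not installed."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


print("torch:", locate_module("torch"))
print("json :", locate_module("json"))  # stdlib module, shown for comparison
```

A path under the user site-packages rather than the conda environment would explain why `conda uninstall pytorch` found nothing to remove.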

Aymen_ABID · January 25, 2024, 2:32am

I have erased everything (conda)
and installed again; same problems.
Currently I get this:

The following packages will be UPDATED:

pytorch pkgs/main::pytorch-2.1.0-cpu_py311hd0~ → pytorch::pytorch-2.1.2-py3.11_cuda12.1_cudnn8_0
torchvision pkgs/main::torchvision-0.15.2-cpu_py3~ → pytorch::torchvision-0.16.2-py311_cu121

Proceed ([y]/n)? y

Downloading and Extracting Packages

CancelledError()
CancelledError()
CancelledError()
CondaError: Downloaded bytes did not match Content-Length
url: https://conda.anaconda.org/pytorch/win-64/pytorch-2.1.2-py3.11_cuda12.1_cudnn8_0.tar.bz2
target_path: C:\ProgramData\anaconda3\pkgs\pytorch-2.1.2-py3.11_cuda12.1_cudnn8_0.tar.bz2
Content-Length: 1339081254
downloaded bytes: 53705125

CancelledError()
CancelledError()

That was for:
(base) C:\Users\aymen>conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia

But this one:

The following packages will be UPDATED:

pytorch pkgs/main::pytorch-2.1.0-cpu_py311hd0~ → pytorch::pytorch-2.1.2-py3.11_cuda11.8_cudnn8_0
torchvision pkgs/main::torchvision-0.15.2-cpu_py3~ → pytorch::torchvision-0.16.2-py311_cu118

Proceed ([y]/n)? y

Downloading and Extracting Packages

Preparing transaction: done
Executing transaction: done

works!

But

torch.cuda.is_available()

still gives False, unfortunately!
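
A way to narrow this down, sketched under the assumption that the notebook kernel was restarted after the install: `torch.version.cuda` is `None` for CPU-only builds, so it separates "the wrong build is still being imported" from "a CUDA build is present but the driver is not usable". `describe_build` is a hypothetical helper.

```python
def describe_build(cuda_version, cuda_available: bool) -> str:
    """Classify a PyTorch install from torch.version.cuda and torch.cuda.is_available()."""
    if cuda_version is None:
        return "CPU-only build is being imported"
    if not cuda_available:
        return f"CUDA {cuda_version} build, but no usable GPU/driver was found"
    return f"CUDA {cuda_version} build with a working GPU"


try:
    import torch
    print(torch.__version__, "->", describe_build(torch.version.cuda, torch.cuda.is_available()))
except ImportError:
    # Nothing named torch is visible to this interpreter.
    print("torch is not importable from this interpreter")
```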