I already have a setup on a local server machine with the following installations:
Python 3.9
CUDA Version: 11.4, and nvcc --version reports:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Feb_14_21:12:58_PST_2021
Cuda compilation tools, release 11.2, V11.2.152
Build cuda_11.2.r11.2/compiler.29618528_0
PyTorch version 2.6.0
This setup works for a couple of models, but when I try to load microsoft/Phi-3-mini-4k-instruct from Hugging Face, I receive the following message:
RuntimeError: The NVIDIA driver on your system is too old (found version 11040). Please update your GPU driver by downloading and installing a new version from the URL: http://www.nvidia.com/Download/index.aspx Alternatively, go to: https://pytorch.org to install a PyTorch version that has been compiled with your version of the CUDA driver.
How can I solve this issue, and why am I receiving it? Why does it work partially in some cases?
You are not compiling with a CUDA driver, so the message sounds wrong. Where did you see it?
CUDA 12 was released 2 years ago. Were your admins not able to update the drivers in this time frame?
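For reference, the number 11040 in the error message is how the CUDA driver API reports the highest CUDA version it supports, encoded as 1000 * major + 10 * minor. A minimal sketch of decoding it (the helper name `decode_cuda_version` is made up for illustration):

```python
def decode_cuda_version(encoded: int) -> tuple:
    """Split CUDA's integer version encoding (1000 * major + 10 * minor)."""
    major, remainder = divmod(encoded, 1000)
    minor = remainder // 10
    return major, minor


# The value from the error message decodes to CUDA 11.4,
# which matches the "CUDA Version: 11.4" listed above.
print(decode_cuda_version(11040))  # (11, 4)
```

So the driver supports at most CUDA 11.4, regardless of which toolkit version nvcc reports.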
If so, you could install a PyTorch binary with CUDA 11.8 libs and check if it would work in your setup.
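The reason a CUDA 11.8 build is worth trying can be sketched roughly as follows. This is a simplified rule of thumb, not an official compatibility matrix: it assumes that from CUDA 11 onward a runtime only needs a driver of the same (or a newer) major version, while older runtimes need a driver at least as new as the runtime. Both helper names are hypothetical, and the "12.4" below is only an assumed example of a CUDA 12 build, since the thread does not say which CUDA runtime the installed PyTorch 2.6.0 binary ships with (`torch.version.cuda` would print it).

```python
def parse_version(text: str) -> tuple:
    """Turn a 'major.minor' string such as '11.8' into (11, 8)."""
    major, minor = text.split(".")[:2]
    return int(major), int(minor)


def driver_can_run(runtime: str, driver: str) -> bool:
    """Rough check whether a driver can run binaries built for a CUDA runtime.

    Assumption: for CUDA 11 and newer, matching or newer driver major
    version is enough (minor version compatibility). For older runtimes,
    the driver must be at least as new as the runtime.
    """
    rt = parse_version(runtime)
    drv = parse_version(driver)
    if rt[0] >= 11:
        return rt[0] <= drv[0]
    return rt <= drv


# Driver from the question supports CUDA 11.4.
print(driver_can_run("12.4", "11.4"))  # False
print(driver_can_run("11.8", "11.4"))  # True
```

Even when this check passes, some features (for example JIT compilation of newer PTX) may still need a newer driver, so treat the result as a hint only.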
Were you able to run any PyTorch binary on your servers before? If so, do you remember which PyTorch version and CUDA runtime libs were used? If not, you could try to use Docker containers which support CUDA’s forward compatibility in case you are using data center GPUs.
Actually, I realized that no code was really running on the GPU. Everything was automatically falling back to the CPU (there was a catch in case it couldn’t run on a GPU).
If you see the same or a similar error message after compiling any other CUDA application (not PyTorch), your systems might have general issues communicating with the GPUs, and you might want to ask your admins for help.
Yes, since you’ve installed a CUDA toolkit locally, you could git clone https://github.com/NVIDIA/cuda-samples, cd into e.g. Samples/0_Introduction/matrixMul, build the example via make, and execute the binary.
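A silent fallback like this is easy to miss. One way to make it visible is to fail loudly when a GPU is expected. A minimal sketch, assuming the device is chosen in one place (`select_device` and its parameters are hypothetical names, not part of PyTorch):

```python
def select_device(require_gpu: bool = True, cuda_available=None) -> str:
    """Return 'cuda' or 'cpu', raising instead of silently using the CPU.

    cuda_available can be passed in explicitly (e.g. for testing);
    if it is None, PyTorch is asked via torch.cuda.is_available().
    """
    if cuda_available is None:
        import torch  # imported lazily so the helper also works without a GPU stack

        cuda_available = torch.cuda.is_available()
    if cuda_available:
        return "cuda"
    if require_gpu:
        raise RuntimeError(
            "CUDA is not available; refusing to fall back to the CPU silently."
        )
    return "cpu"


# Typical use (requires PyTorch to be installed):
# device = select_device()           # raises if the GPU cannot be used
# model = model.to(device)
```

With `require_gpu=False` the helper still returns "cpu", so the fallback remains possible but becomes an explicit choice.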