RuntimeError: The NVIDIA driver on your system is too old (found version 11040)

I have already a setup in a local server machine with the following installations:

  • Python 3.9
  • CUDA Version: 11.4, and nvcc --version reports:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Feb_14_21:12:58_PST_2021
Cuda compilation tools, release 11.2, V11.2.152
Build cuda_11.2.r11.2/compiler.29618528_0

  • PyTorch version 2.6.0

This setup works for a couple of models, but when I try to load microsoft/Phi-3-mini-4k-instruct from Hugging Face I receive the following message:

RuntimeError: The NVIDIA driver on your system is too old (found version 11040). Please update your GPU driver by downloading and installing a new version from the URL: http://www.nvidia.com/Download/index.aspx Alternatively, go to: https://pytorch.org to install a PyTorch version that has been compiled with your version of the CUDA driver.

How can I solve this issue, and why am I receiving it? Why does it work partially in some cases?

Assuming you’ve installed the PyTorch binaries with CUDA 12.x runtime dependencies, you would need to update your NVIDIA driver to >=525.60.13.
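You can check both sides of the mismatch from Python. A minimal sketch (it makes no assumptions beyond torch possibly being installed in the environment):

```python
# Compare the CUDA runtime the installed PyTorch wheel was built with
# against what the driver can actually serve.
try:
    import torch
    built_with = torch.version.cuda        # e.g. "12.4" for a cu124 wheel; None for CPU-only builds
    driver_ok = torch.cuda.is_available()  # False if the driver is too old for this runtime
    print(f"PyTorch {torch.__version__} built with CUDA {built_with}")
    print(f"CUDA usable with this driver: {driver_ok}")
except ImportError:
    built_with, driver_ok = None, None
    print("PyTorch is not installed in this environment")
```

If `torch.version.cuda` reports a 12.x runtime while the driver only supports CUDA 11.4, you will hit exactly this error as soon as a model actually touches the GPU.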

Since this is a server and not a personal machine, could that be done just for myself or needs to be handled by the system admin?

You would need root access to update drivers, so you might need to contact your admin if your user account does not have these rights.

What about this message:

Alternatively, go to: https://pytorch.org to install a PyTorch version that has been compiled with your version of the CUDA driver

Could I just solve my issue by updating PyTorch version?

You are not compiling with a CUDA driver, so the message sounds wrong. Where did you see it?

CUDA 12 was released 2 years ago and it seems your admins weren’t able to update the drivers in this time frame?
If so, you could install a PyTorch binary with CUDA 11.8 libs and check if it would work in your setup.

Tbh, I thought that I had already installed a PyTorch version for CUDA 11.8. How can I find the correct PyTorch version for my case?

Select the CUDA runtime version from the install matrix and copy/paste the install command into your terminal:
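For reference, the matrix produces a command along these lines for the CUDA 11.8 wheels (copy the exact command from pytorch.org rather than this sketch, since wheel names and supported versions change between releases):

```shell
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

# afterwards, confirm which runtime the installed wheel was built against:
python -c "import torch; print(torch.__version__, torch.version.cuda)"
```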

Ah yeah, this is what I have already done! Then, I guess there is an issue with my model and old CUDA versions!

Were you able to run any PyTorch binary on your servers before? If so, do you remember which PyTorch version and CUDA runtime libs were used? If not, you could try to use docker containers which support CUDA’s forward compatibility in case you are using Datacenter GPUs.
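For the container route, NGC PyTorch images bundle the CUDA runtime plus the forward-compatibility libraries. A hedged example (the tag below is illustrative; pick a current one from the NGC catalog, and note that `--gpus all` requires the NVIDIA Container Toolkit on the host):

```shell
docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:24.01-py3 \
    python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```

Forward compatibility only applies to Datacenter GPUs, so this won’t help on consumer cards with an old driver.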

Actually, I realized that no code was really running on the GPU; it was automatically falling back to the CPU (there was a try/except fallback for the case where it couldn’t run on a GPU).

OK, are you able to compile and run any CUDA sample?

Apparently not! I always receive this message when I try to use CUDA devices instead of the CPU!

If you see the same or a similar error message after compiling and running any other CUDA application (not PyTorch), it seems your system might have general issues communicating with the GPUs, and you might want to ask your admins for help.

I mean, the nvidia-smi command returns the same CUDA version as nvcc --version. Is there another way to check CUDA besides PyTorch?

Yes, since you’ve installed a CUDA toolkit locally, you could git clone https://github.com/NVIDIA/cuda-samples, cd into e.g. Samples/0_Introduction/matrixMul, build the example via make, and execute the binary.
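Alternatively, a tiny standalone check, independent of PyTorch, is to query the driver and runtime versions via the CUDA runtime API. A minimal sketch (assumes the local CUDA toolkit; build with e.g. `nvcc check.cu -o check`):

```cpp
// check.cu -- report driver/runtime CUDA versions and device count
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int driver = 0, runtime = 0, devices = 0;
    cudaDriverGetVersion(&driver);    // highest CUDA version the installed driver supports
    cudaRuntimeGetVersion(&runtime);  // version of the CUDA runtime library linked in
    cudaError_t err = cudaGetDeviceCount(&devices);

    printf("driver supports CUDA %d.%d, runtime is %d.%d\n",
           driver / 1000, (driver % 1000) / 10,
           runtime / 1000, (runtime % 1000) / 10);
    if (err != cudaSuccess) {
        printf("cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("found %d CUDA device(s)\n", devices);
    return 0;
}
```

If this fails in the same way, the problem is between the driver and the GPUs, not in PyTorch.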