Yes, the NVIDIA Driver and the CUDA Toolkit are two separate things. “CUDA” on its own can refer to either: the driver as well as the toolkit.
While the NVIDIA Driver makes sure your system can properly communicate with and use your GPU, the CUDA Toolkit ships with math libraries (cuBLAS, cuSOLVER, cuRAND, etc.) as well as the CUDA compiler toolchain (`nvcc`, `ptxas`, etc.). Installing a CUDA Toolkit locally allows you to build CUDA applications, such as PyTorch, from source. You have certainly seen my comment on a few questions explaining that the PyTorch binaries ship with their own CUDA runtime dependencies (i.e. the CUDA runtime, the CUDA math libs, etc.) and users only need to properly install an NVIDIA Driver.
Now, if you download a CUDA Toolkit from e.g. here, it will ship with the NVIDIA Driver as well. During the installation you will have the option to install the Toolkit alone or the Driver as well. You can also download the NVIDIA Driver separately, which will of course be a smaller package.
Yes, if you mean the CUDA Toolkit by “cuda”. I.e. if you mainly want to run PyTorch applications and don’t plan to ever build any CUDA application or a PyTorch 3rd-party library etc. from source, you won’t need to install a CUDA Toolkit and can depend on the CUDA runtime dependencies which will be installed during the PyTorch binary installation. While running `pip install torch` you will see that `nvidia-*` wheels are pulled into your environment, allowing PyTorch to use these.
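If you are curious which of these runtime wheels ended up in your environment, you can list them via the standard library. A minimal sketch (the exact `nvidia-*` package names depend on the PyTorch build you installed, so treat the example as illustrative):

```python
from importlib import metadata

def cuda_runtime_wheels(names):
    """Filter a list of distribution names for the NVIDIA CUDA runtime wheels."""
    return sorted(n for n in names if n.startswith("nvidia-"))

# Inspect the live environment:
installed = [d.metadata["Name"] or "" for d in metadata.distributions()]
print(cuda_runtime_wheels(installed))
```

On an environment where the CUDA-enabled binaries are installed you would see entries such as `nvidia-cublas-cu12` or `nvidia-cudnn-cu12`; in a CPU-only environment the list is empty.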
This is expected, as an OS update would most likely require a driver re-installation as well.
This is the right approach. You can install the NVIDIA drivers from `apt`, or you could also use the standalone installers from here. I would strongly advise not to mix these approaches: if you installed the drivers from `apt`, stick with it for driver updates etc.
Great! As a quick smoke test, you could create a tensor on the GPU to make sure PyTorch can also communicate with the device.
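Such a smoke test could look like this minimal sketch (the `pick_device` helper is just for illustration, and the guard keeps the script runnable even where torch is not installed):

```python
def pick_device(cuda_available: bool) -> str:
    # Fall back to the CPU when no GPU is visible to PyTorch.
    return "cuda" if cuda_available else "cpu"

try:
    import torch
    device = pick_device(torch.cuda.is_available())
    x = torch.randn(4, 4, device=device)
    print(x.sum().item(), "computed on", x.device)
except ImportError:
    print("torch is not installed in this environment")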
Yes, as mentioned before, the CUDA Toolkit download ships with the NVIDIA Driver as well and you have the option to select its installation. However, based on your description you installed the NVIDIA Driver beforehand and the PyTorch binaries afterwards, which is fine and correct. Note that the PyTorch binaries do not ship with an NVIDIA Driver, so you have to install it beforehand.
I assume you are referring to the “Driver Version: 550.120 CUDA Version: 12.4” output of `nvidia-smi`. If so, note that `nvidia-smi` reports the driver version and the latest CUDA version this driver supports. It does not tell you that a full CUDA Toolkit was properly installed (and you can have multiple CUDA Toolkits installed on your system). To check for a locally installed CUDA Toolkit, run `nvcc --version` or try to build any CUDA sample from source.
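For reference, `nvcc --version` prints a release line you can parse. A small sketch (the sample output string is an assumption based on a typical CUDA 12.4 install; the helper is purely illustrative):

```python
import re

# Assumed sample of what `nvcc --version` prints on a 12.4 install:
SAMPLE_NVCC_OUTPUT = """\
nvcc: NVIDIA (R) Cuda compiler driver
Cuda compilation tools, release 12.4, V12.4.131
"""

def toolkit_version(nvcc_output: str):
    """Extract the 'release X.Y' toolkit version from nvcc --version output."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None

print(toolkit_version(SAMPLE_NVCC_OUTPUT))  # 12.4
```

In a real script you would feed it the stdout of an `nvcc --version` subprocess call; if `nvcc` is not found at all, no local toolkit is on your `PATH`.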
You are thus right in assuming PyTorch will use its own CUDA 12.6.3 runtime dependencies during execution.
Also, you haven’t asked, but just to explain the compatibility a bit more: the NVIDIA Driver is compatible across all minor CUDA updates. I.e. to run any PyTorch binary built with CUDA 12.x you need to install any NVIDIA Driver >= 525.60.13, as described in the Minor Version Compatibility docs. Once the driver is installed you can simply `pip install` any PyTorch binary (stable, nightly, etc.) you want.
NVIDIA Drivers are not compatible between CUDA major updates. I.e. if you are using a CUDA 11.x driver, you won’t be able to run applications compiled and linked against CUDA 12.x dependencies.
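The minor-version compatibility rule above can be sketched as a small version check. The 525.60.13 minimum for CUDA 12.x comes from the docs mentioned above; the helper itself is just for illustration:

```python
# Minimum Linux driver for minor version compatibility within CUDA 12.x
# (per the Minor Version Compatibility docs).
MIN_DRIVER_FOR_CUDA_12 = (525, 60, 13)

def parse_driver(version: str) -> tuple:
    """Turn a driver string like '550.120' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))

def can_run_cuda_12(driver_version: str) -> bool:
    # Tuples compare element-wise, so '550.120' >= '525.60.13' works as expected.
    return parse_driver(driver_version) >= MIN_DRIVER_FOR_CUDA_12

print(can_run_cuda_12("550.120"))    # True: fine for any CUDA 12.x binary
print(can_run_cuda_12("470.82.01"))  # False: a CUDA 11.x era driver
```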
Let me know if you have any more questions!