Is it required to set up CUDA on a PC before installing CUDA-enabled PyTorch?

What’s the status right now? It seems that when I run pip install on a machine with a GPU it installs the full CUDA version, and otherwise it installs a much smaller one without CUDA. It happens even in Docker, which makes it difficult to ensure consistent builds.

Is there any flag or environment variable to force it?

That’s not correct as pip install torch will install the PyTorch binary with all needed CUDA libraries from PyPI. This workflow allows us to create a small PyTorch binary, which can be posted on PyPI, and use the CUDA libraries from PyPI which can also be shared with other Python packages.
Before this workflow was enabled all CUDA libraries were packaged into the large PyTorch wheel, which had a size of >1.5GB and was thus hosted on a custom mirror (not PyPI due to size limitations).

Yes, I was not clear. I noticed that at some point they were baked into the torch package. By CUDA version I meant the version that has all those nvidia-* packages (nvidia-cuda-runtime-cu11 etc.) as requirements. So is there a flag to either force or prevent this?

I guess, I can always just manually append them to my pip install, but I’m not sure I won’t miss something that way.

I’m still unsure what the issue is. The CUDA dependencies were always installed, but before they were baked into the wheel, which created the large binary of >1.5GB.
If you don’t want to install the pip wheels with any CUDA dependencies, you could install the CPU-only wheels as given in the install instructions.
This workflow is also the same as before and for the current release the command would be:

pip3 install torch torchvision torchaudio --index-url

on Linux systems.

I have a Dockerfile with a line:

RUN pip install torch==1.13.1 torchvision torchaudio

(I’ll try 2.0 later)

When my colleague, who has a GPU, builds this image, we get a proper image that we can use for training.
When I build this image on a Mac, I only get the base torch version, without those CUDA packages.

Check which packages are found and installed since Mac doesn’t support CUDA and would most likely install either the CPU-only binary or the MPS one (I’m not using Mac so don’t know how the package selection would work there).
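One way to check this from inside the image is to list what pip actually installed. The snippet below is only a sketch using the standard library's importlib.metadata; the helper names are made up for illustration, and it assumes the CUDA dependencies show up as distributions whose names start with nvidia-, as described above.

```python
from importlib import metadata


def installed_nvidia_packages():
    """Return sorted (name, version) pairs for installed nvidia-* distributions."""
    found = set()
    for dist in metadata.distributions():
        meta = dist.metadata
        name = meta.get("Name") if meta is not None else None
        if name and name.lower().startswith("nvidia-"):
            found.add((name, str(dist.version)))
    return sorted(found)


def torch_cuda_requirements():
    """Return the nvidia-* requirements torch declares, with their platform markers."""
    try:
        reqs = metadata.requires("torch") or []
    except metadata.PackageNotFoundError:
        # torch is not installed in this environment
        return []
    return [r for r in reqs if r.lower().startswith("nvidia-")]


if __name__ == "__main__":
    print("Installed nvidia-* packages:")
    for name, version in installed_nvidia_packages():
        print(f"  {name}=={version}")
    print("nvidia-* requirements declared by torch:")
    for req in torch_cuda_requirements():
        print(f"  {req}")
```

If the first list is empty while the second is not, the requirement markers (the part after the semicolon in each requirement) most likely did not match the platform pip was running on.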

Ok, I guess I see the problem now. I was confused because, in my case, pip always runs inside Docker with regular Ubuntu 22.04. What I didn’t consider is that Docker may automatically provide a different ubuntu:22.04 image depending on my architecture. Since I have an M1 with an ARM chip, I got the aarch64 one. The torch installation checks for x86_64 when installing the dependencies.

So it all boils down to Intel vs. M1 rather than GPU vs. no GPU.

In the end I solved it by using a flag at the beginning of my Dockerfile:

FROM --platform=linux/amd64 ubuntu:22.04

Obviously, it seems a bit slower than using the default ARM, but there is no other choice if you later want to deploy this image on GPU VMs.
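To confirm which architecture the container actually reports, a small check like the one below can be run inside the image. It is a rough sketch: is_linux_x86_64 is a hypothetical helper, and the condition only mirrors the x86_64 check described above, not pip's actual resolution logic.

```python
import platform


def is_linux_x86_64(system=None, machine=None):
    """Return True if the given (or current) platform is Linux on x86_64."""
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    # Some tools report the same architecture as "amd64"
    return system == "Linux" and machine.lower() in ("x86_64", "amd64")


if __name__ == "__main__":
    print(f"system={platform.system()} machine={platform.machine()}")
    if is_linux_x86_64():
        print("Linux x86_64: same platform as the GPU VMs.")
    else:
        print("Not Linux x86_64: expect a different set of torch dependencies.")
```

On an M1 without the --platform flag, the container would be expected to report aarch64 here.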

Hi all, I am not sure if the conversation is still valid. I have one question:

  1. I installed PyTorch with:
    conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia

  2. I tried this:
    import torch

It shows: NVIDIA GeForce RTX 3050

Does it mean I have the GPU ready to use? And lastly, do I now need CUDA to use the GPU?

  1. How can I download CUDA? Is there any other dependency or package that needs to be installed?
    I ask because even though I see the GPU, when I run deep learning code, I don’t actually see the GPU being used. Also, which CUDA toolkit do I download?

Yes, you should be able to use your GPU, as PyTorch is able to communicate with it already. Run a quick smoke test via x = torch.randn(1).cuda() and make sure print(x) shows a CUDA tensor, i.e. the output includes device='cuda:0'.
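A slightly longer version of that smoke test could look like the sketch below. The function name cuda_smoke_test is just for illustration; the torch calls are the standard ones (torch.cuda.is_available, torch.cuda.get_device_name), and the function returns early if torch is not installed.

```python
import importlib.util


def cuda_smoke_test():
    """Collect basic facts about the torch install and try to create a CUDA tensor."""
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False}

    import torch

    report = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # None for CPU-only builds, otherwise the CUDA version the binary ships with
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    if report["cuda_available"]:
        report["device_name"] = torch.cuda.get_device_name(0)
        x = torch.randn(1).cuda()  # the smoke test from the post above
        report["tensor_device"] = str(x.device)
    return report


if __name__ == "__main__":
    for key, value in cuda_smoke_test().items():
        print(f"{key}: {value}")
```

If cuda_available is True and tensor_device starts with cuda, the driver and the binary package are working together.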

No, as explained before:

I see the GPU is still not being used while I run the code. Is there something I am missing?

You might be running into the misleading output of Windows’ task manager as explained here and here. Use nvidia-smi or select the right task manager view.


I tried nvidia-smi in one terminal while running my code.
Below is the screenshot of what I found:
To my understanding GPU is not being used. Am I correct? If so, what can be the reason?

  • I got the output below when I ran this: tensor([-0.5699], device='cuda:0')

The binaries ship with the CUDA runtime for ease of use, as often users struggle to install the local CUDA toolkit with other libraries such as cuDNN and NCCL locally.
To use the GPU on your system in PyTorch you would thus only need to install the correct NVIDIA driver and one of the binary packages.

  • For this: How do I check whether my machine has the correct NVIDIA driver and one of the binary packages? Can you help me out, please? I am quite new to this.

I also checked in the device manager: NVIDIA GeForce RTX 3050 Properties:
Below is the snapshot:
Does it mean I already have the drivers installed?

This means you can and are already using the GPU, which also means you have to have a properly installed NVIDIA driver.


If so, why does nvidia-smi return "No running processes found" when I run it while my code is running? I have attached a snapshot above.

Is it possible?

Yes, this might be another Windows issue, where restricted permissions prevent nvidia-smi from reading information about other processes.

Does this mean I don’t have to do anything with the GPU? Everything is installed, and I am already using it. Although I can’t see the usage on Windows.

Yes, your GPU is already working, as mentioned before. If you want to see higher GPU utilization, write a smoke test that reruns large matmuls (matrix multiplications) in a loop.
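Something along these lines would do as a load test. This is a minimal sketch, not a benchmark: matmul_load, the matrix size, and the duration are arbitrary choices for illustration, and the function does nothing if torch or a CUDA device is missing.

```python
import importlib.util
import time


def matmul_load(seconds=10.0, size=4096):
    """Run size x size matmuls on the GPU for roughly `seconds`; return the iteration count."""
    if importlib.util.find_spec("torch") is None:
        return 0

    import torch

    if not torch.cuda.is_available():
        return 0

    a = torch.randn(size, size, device="cuda")
    b = torch.randn(size, size, device="cuda")
    iterations = 0
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        _ = a @ b
        # Wait for the kernel to finish so the loop tracks real GPU work
        torch.cuda.synchronize()
        iterations += 1
    return iterations


if __name__ == "__main__":
    print(f"Completed {matmul_load()} matmul iterations")
```

Save it as a Python script, run it with python, and watch nvidia-smi (or the right task manager view) in a second terminal while it runs.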

Sorry, do you mean running this command matmuls on the terminal while running the code?