Any chance of using CUDA 12.2?

First of all, congratulations on your amazing work.

I would like to use CUDA support, but as I can see on Start Locally | PyTorch, the current PyTorch packages require CUDA 11.7 or 11.8.

However, NVIDIA is already distributing binaries for version 12.2.

Is the upgrade to CUDA 12.2 in the near-term plans of the PyTorch development team?

Best regards,

The nightly binaries already ship with 11.8 and 12.1, which will be the targets for the next PyTorch 2.1.0 release. We will start bringing up 12.2 soon, but it will miss the 2.1.0 release. In the meantime you can build from source.
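As a quick sanity check before installing, you can compare your local CUDA toolkit version against the versions the nightly binaries ship with. A minimal sketch — the supported-version set below is just the list from this post, and `parse_nvcc_version` / `nightly_supports` are hypothetical helpers, not part of any PyTorch API:

```python
import re

# CUDA versions the current PyTorch nightly binaries ship with (per the post above).
NIGHTLY_CUDA_VERSIONS = {"11.8", "12.1"}

def parse_nvcc_version(nvcc_output: str) -> str:
    """Extract the toolkit version (e.g. '12.2') from `nvcc --version` output."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    if match is None:
        raise ValueError("could not find a release version in nvcc output")
    return match.group(1)

def nightly_supports(toolkit_version: str) -> bool:
    """True if a prebuilt nightly wheel matches the local CUDA toolkit version."""
    return toolkit_version in NIGHTLY_CUDA_VERSIONS

# Example nvcc output line for a CUDA 12.2 toolkit:
sample = "Cuda compilation tools, release 12.2, V12.2.91"
version = parse_nvcc_version(sample)
print(version, nightly_supports(version))  # 12.2 False -> build from source
```

If the check comes back `False`, a source build against the local toolkit (as suggested above) is the remaining option until 12.2 wheels land.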


Thanks a lot, kind sir!

I will take a look at building PyTorch from source. PyTorch is certainly a huge piece of software in every sense of the word, but the NVIDIA side seems even more complex to build, as it has so many interactions with the hardware.

@ptrblck Are we going to have issues with the nightlies when using torch.compile, because of the Triton CUDA version? See: After build from source, Error occured "device kernel image is invalid" · Issue #1955 · openai/triton · GitHub

OpenAI/Triton should not depend on any PyTorch CUDA libs, so I would not expect to see issues. However, I’m not familiar enough with Triton’s issues and roadmap.

It is more about using the nightly PyTorch builds, where the CUDA version of Triton seems to be hardcoded:

I am on the same commit as our CI:

How are we going to keep releasing working CUDA 11.8 nightly wheels (with torch.compile)?

It’s still unclear what kind of issues you are seeing. Triton should use its own ptxas, shouldn’t it?
Which dependencies would be conflicting and what could fail? Again, I’m not the code owner of Triton, but expect to see no Triton->PyTorch dependencies for CUDA.

What I meant is that the official conda PyTorch nightly for CUDA 11.8 is going to trigger Triton’s “device kernel image is invalid” error.


Yup, seeing this problem as well. I installed pytorch-nightly in a fresh conda environment, but any attempt at compilation fails with:

```
[2023-07-29 04:04:26,800] torch._dynamo.convert_frame: [WARNING]   File "/home/user/anaconda3/envs/torchnightly/lib/python3.11/site-packages/triton/compiler/", line 589, in _init_handles
[2023-07-29 04:04:26,800] torch._dynamo.convert_frame: [WARNING]     mod, func, n_regs, n_spills = fn_load_binary(self.metadata["name"], self.asm[bin_path], self.shared, device)
[2023-07-29 04:04:26,800] torch._dynamo.convert_frame: [WARNING]                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2023-07-29 04:04:26,800] torch._dynamo.convert_frame: [WARNING] torch._dynamo.exc.BackendCompilerFailed: backend='inductor' raised:
[2023-07-29 04:04:26,800] torch._dynamo.convert_frame: [WARNING] RuntimeError: Triton Error [CUDA]: device kernel image is invalid
```
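For context on what the error itself means: “device kernel image is invalid” is typically raised when the driver refuses to load a compiled kernel binary, for instance because the kernel was built by a newer CUDA toolkit than the driver supports. A rough, purely illustrative check of that one failure mode — the version rule below is a simplification, and `driver_can_load` is a hypothetical helper, not a Triton or PyTorch API:

```python
def driver_can_load(driver_cuda: str, kernel_toolkit: str) -> bool:
    """Heuristic: a driver can generally load kernels built with a CUDA
    toolkit no newer than the CUDA version the driver itself reports
    (setting aside minor-version compatibility within a major release)."""
    def as_tuple(version: str) -> tuple:
        major, minor = version.split(".")
        return (int(major), int(minor))
    return as_tuple(driver_cuda) >= as_tuple(kernel_toolkit)

# e.g. an 11.8 driver asked to load a kernel built by a 12.1 toolkit:
print(driver_can_load("11.8", "12.1"))  # False
```

This is only one possible cause; a kernel compiled for the wrong GPU architecture can produce the same error, so the check above is a diagnostic starting point, not a full explanation of the issue in this thread.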

Any updates on this?

There is a ticket tracking this: Pytorch nighlty and openAI/triton cuda · Issue #106144 · pytorch/pytorch · GitHub