CUDA error: the provided PTX was compiled with an unsupported toolchain [in Nvidia Ampere]

Hi,

I tried to install pointnet2 (a major architecture for point cloud data) and the installation went through without any errors. However, when I try to import the ops via from pointnet2_ops import pointnet2_utils, I get this error:
CUDA error: the provided PTX was compiled with an unsupported toolchain.

Based on some searching, this error occurs when the CUDA version PyTorch was built with is newer than the driver version installed on the machine. But that is not the case for me. Here is my setup, with a quick version check after the list:

  1. CUDA version on the Ubuntu machine: 11.3
  2. CUDA version used when installing PyTorch: 11.1
  3. PyTorch: version 1.9.0
  4. Python: version 3.9
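
For completeness, here is a quick way to double-check those versions from Python (a minimal sketch, assuming torch imports cleanly; the machine's CUDA version came from nvcc):

import torch

print(torch.__version__)                   # PyTorch version (1.9.0 here)
print(torch.version.cuda)                  # CUDA version PyTorch was built with (11.1 here)
print(torch.cuda.get_device_capability())  # compute capability, e.g. (8, 6) on an A5000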

This error happens on an Nvidia A5000 GPU (Ampere arch). When I install with exactly the same settings as above, but on my machine with an Nvidia 2080 Ti, everything works and no error is encountered. Is there any direction on how to solve this issue on my Nvidia A5000 machine?

Detailed error message:

Traceback (most recent call last):
  File "/home/aumam/dev/multimodal_distillation/model/model.py", line 457, in <module>
    net = net.cuda()
  File "/home/aumam/anaconda3/envs/pytorch1.9_4/lib/python3.9/site-packages/torch/nn/modules/module.py", line 637, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "/home/aumam/anaconda3/envs/pytorch1.9_4/lib/python3.9/site-packages/torch/nn/modules/module.py", line 530, in _apply
    module._apply(fn)
  File "/home/aumam/anaconda3/envs/pytorch1.9_4/lib/python3.9/site-packages/torch/nn/modules/module.py", line 530, in _apply
    module._apply(fn)
  File "/home/aumam/anaconda3/envs/pytorch1.9_4/lib/python3.9/site-packages/torch/nn/modules/module.py", line 530, in _apply
    module._apply(fn)
  File "/home/aumam/anaconda3/envs/pytorch1.9_4/lib/python3.9/site-packages/torch/nn/modules/module.py", line 552, in _apply
    param_applied = fn(param)
  File "/home/aumam/anaconda3/envs/pytorch1.9_4/lib/python3.9/site-packages/torch/nn/modules/module.py", line 637, in <lambda>
    return self._apply(lambda t: t.cuda(device))
RuntimeError: CUDA error: the provided PTX was compiled with an unsupported toolchain.
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

Many thanks

It seems Ampere GPUs are not supported, since TORCH_CUDA_ARCH_LIST is hardcoded to GPUs up to sm_75 here.
In any case, you’ve already cross-posted the question, so I would expect the authors to know more about the limitations.

Thanks so much for pointing that out! I added “8.6” to TORCH_CUDA_ARCH_LIST and it solved the issue.
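
In case it helps others, the change was along these lines (a sketch of the setup.py edit, assuming the extension pins the arch list via an environment variable as in the linked source; the exact list may differ between versions):

# In the pointnet2 extension's setup.py (sketch; exact contents may vary):
import os
os.environ["TORCH_CUDA_ARCH_LIST"] = "5.0;6.0;6.1;6.2;7.0;7.5;8.6"  # "8.6" added for Ampere (sm_86)

Note that the extension has to be rebuilt (pip install it again) for the new arch to take effect.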

Hey, I’m facing the same issue, but I am using an RTX 4090 card. I installed the CUDA toolkit version 12.0 from here.

I have PyTorch 2.0 with pytorch-cuda=11.8 and Python 3.11.4.

PS: I just got this card and I’m in the process of setting it up, so please let me know if I should install any additional drivers.

I guess you are seeing this error from OpenAI/Triton? If so, then note that they ship ptxas from CUDA 12 and you might need a driver update as described here.
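
If you want to confirm the driver side, something like this prints the installed driver version (a sketch via the NVML bindings, pip install nvidia-ml-py; if I read the release notes right, CUDA 12.0 wants roughly driver >= 525.60.13 on Linux):

import pynvml  # pip install nvidia-ml-py

pynvml.nvmlInit()
print(pynvml.nvmlSystemGetDriverVersion())  # installed driver, e.g. "525.xx"
pynvml.nvmlShutdown()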

Hey, I got this error while trying to load a GGUF model using ctransformers:

llm = AutoModelForCausalLM.from_pretrained("TheBloke/Llama-2-7B-GGML", gpu_layers=50)

Do you know if your call uses torch.compile or anything from OpenAI/Triton under the hood?
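
A quick way to test: run a tiny torch.compile smoke check (a minimal sketch; on CUDA this lowers through TorchInductor, which emits Triton kernels, so hitting the same PTX error here would point at Triton's bundled ptxas rather than ctransformers):

import torch

def f(x):
    return torch.relu(x) * 2

compiled = torch.compile(f)         # on CUDA this goes through TorchInductor/Triton
x = torch.randn(16, device="cuda")
print(compiled(x).sum())            # same PTX error here => Triton/ptxas is the culprit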