I tried to install pointnet2 (a major architecture for point cloud data) and the installation went well without any error. However, when I try to import the operation via from pointnet2_ops import pointnet2_utils, I get this error: CUDA error: the provided PTX was compiled with an unsupported toolchain.
Based on some searching, the error is because the cuda version used by pytorch is newer than the driver version installed on the machine. But, that’s not in my case. Here is my setup:
Cuda version in the Ubuntu machine: 11.3
Cuda version when install pytorch: 11.1
Pytorch: version 1.9.0
Python: version 3.9
This error happens in GPU Nvidia A5000 (ampere arch). When I install with the exactly same setting as above, but in my machine with Nvidia 2080 Ti, everything goes well. No any error was encountered. Is there any direction how to solve this issue in my nvidia A5000 machine?
Detailed error message:
Traceback (most recent call last):
File "/home/aumam/dev/multimodal_distillation/model/model.py", line 457, in <module>
net = net.cuda()
File "/home/aumam/anaconda3/envs/pytorch1.9_4/lib/python3.9/site-packages/torch/nn/modules/module.py", line 637, in cuda
return self._apply(lambda t: t.cuda(device))
File "/home/aumam/anaconda3/envs/pytorch1.9_4/lib/python3.9/site-packages/torch/nn/modules/module.py", line 530, in _apply
module._apply(fn)
File "/home/aumam/anaconda3/envs/pytorch1.9_4/lib/python3.9/site-packages/torch/nn/modules/module.py", line 530, in _apply
module._apply(fn)
File "/home/aumam/anaconda3/envs/pytorch1.9_4/lib/python3.9/site-packages/torch/nn/modules/module.py", line 530, in _apply
module._apply(fn)
File "/home/aumam/anaconda3/envs/pytorch1.9_4/lib/python3.9/site-packages/torch/nn/modules/module.py", line 552, in _apply
param_applied = fn(param)
File "/home/aumam/anaconda3/envs/pytorch1.9_4/lib/python3.9/site-packages/torch/nn/modules/module.py", line 637, in <lambda>
return self._apply(lambda t: t.cuda(device))
RuntimeError: CUDA error: the provided PTX was compiled with an unsupported toolchain.
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
It seems Ampere GPUs are not supported since TORCH_CUDA_ARCH_LIST is hardcoded to GPUs up to sm_75here.
In any case, you’ve already cross-posted the question so I would expect the authors to know more about the limitations.
I guess you are seeing this error from OpenAI/Triton? If so, then note that they ship ptxas from CUDA 12 and you might need to driver update as described here.
Hi @ptrblck , I have a trouble with this problem.
If I run the file init.py directly, my tensor can be transformed to CUDA, but if I run another script (and this script import and call the file init.py), it has the problem.
Can you help me for this problem ? Thanks for your help
I haven’t encountered the issue myself, so unsure what exactly is causing it.
Based on your config I guess it could be related to OAI/Triton shipping with ptxas from CUDA 12.x while your driver supports CUDA 11.x as described in this issue. You could try to apply the workaround of specifying TRITION_PTXAS_PATH pointing to a ptxas from CUDA 11.
Thanks for your reply, I solved this problem, the reason is that the nvidia-smi version and nvcc --version mismatch together, so when I built submodules, it cannot work to transform tensor to cuda.
Thanks for your help.