PyTorch + torchsparse compatibility for RTX 5090

I am trying to set up an environment with PyTorch and torchsparse ( GitHub - mit-han-lab/torchsparse: [MICRO'23, MLSys'22] TorchSparse: Efficient Training and Inference Framework for Sparse Convolution on GPUs. ) with CUDA support to run the scenescript demo ( GitHub - facebookresearch/scenescript: Public code release associated with SceneScript. ).

I managed to set up the environment, but I am getting the following error:

AcceleratorError                          Traceback (most recent call last)
Cell In[6], line 1
----> 1 lang_seq = model_wrapper.run_inference(
      2     point_cloud_obj.points,
      3     nucleus_sampling_thresh=0.05,  # 0.0 is argmax, 1.0 is random sampling
      4     verbose=True,
      5 )

File ~/miniconda3/envs/misc/lib/python3.12/site-packages/torch/utils/_contextlib.py:124, in context_decorator.<locals>.decorate_context(*args, **kwargs)
    120 @functools.wraps(func)
    121 def decorate_context(*args, **kwargs):
    122     # pyrefly: ignore [bad-context-manager]
    123     with ctx_factory():
--> 124         return func(*args, **kwargs)

File ~scenescript/src/networks/scenescript_model.py:238, in SceneScriptWrapper.run_inference(self, raw_point_cloud, nucleus_sampling_thresh, verbose)
    236 # Encode the visual inputs
    237 pc_sparse_tensor, pc_min = self.preprocess_point_cloud(raw_point_cloud)
--> 238 encoded_visual_input = self.model["encoder"](pc_sparse_tensor)
    239 context = encoded_visual_input["context"]
    240 context_mask = encoded_visual_input["context_mask"]

File ~/miniconda3/envs/misc/lib/python3.12/site-packages/torch/nn/modules/module.py:1776, in Module._wrapped_call_impl(self, *args, **kwargs)
   1774     return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
...
Search for `cudaErrorIllegalAddress' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

Some environment info:

python --version
Python 3.12.12
- GCC:13.3
- NVCC:12.9
- PyTorch:2.10
- PyTorch CUDA:12.8
python -c "import torch; print(torch.__version__); print(torch.cuda.get_arch_list()); print(torch.randn(1).cuda())"

2.10.0+cu128
['sm_70', 'sm_75', 'sm_80', 'sm_86', 'sm_90', 'sm_100', 'sm_120']
tensor([0.3935], device='cuda:0')

I also tried another environment:

- GCC:13.3
- PyTorch:2.12
- PyTorch CUDA:13.0

python -c "import torch; print(torch.__version__); print(torch.cuda.get_arch_list()); print(torch.randn(1).cuda())"

2.12.0.dev20260217+cu130
['sm_75', 'sm_80', 'sm_86', 'sm_90', 'sm_100', 'sm_120', 'compute_120']
tensor([-0.3518], device='cuda:0')

How can I solve the cudaErrorIllegalAddress issue? @ptrblck

Based on the output, I assume a third-party library ships with custom kernels that do not support Blackwell GPUs (yet). Try to isolate the failing code, and also use blocking launches (CUDA_LAUNCH_BLOCKING=1) as the error message suggests.
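
A minimal sketch of how blocking launches could be enabled from inside the notebook, assuming the variable is set before the first CUDA call (safest: before `import torch`). The helper names `has_native_kernels` and `report_device` are hypothetical and only for illustration. The check covers PyTorch's own build only; it says nothing about which architectures a separately compiled extension such as torchsparse was built for.

```python
import os

# Set this before the first CUDA call (ideally before importing torch), so that
# kernel launches run synchronously and the traceback points at the op that failed.
os.environ.setdefault("CUDA_LAUNCH_BLOCKING", "1")


def has_native_kernels(capability, arch_list):
    """Return True if arch_list has an sm_XY entry matching the device capability.

    Hypothetical helper: capability is a (major, minor) tuple as returned by
    torch.cuda.get_device_capability(), arch_list is a list of strings as
    returned by torch.cuda.get_arch_list().
    """
    major, minor = capability
    return f"sm_{major}{minor}" in arch_list


def report_device():
    """Print what the GPU reports and what this PyTorch build was compiled for."""
    import torch  # imported here so the helper above stays usable without torch

    capability = torch.cuda.get_device_capability(0)
    arch_list = torch.cuda.get_arch_list()
    print("device:", torch.cuda.get_device_name(0))
    print("compute capability:", capability)
    print("PyTorch arch list:", arch_list)
    print("native PyTorch kernels:", has_native_kernels(capability, arch_list))
```

Calling `report_device()` in a fresh kernel and then re-running the failing `run_inference` cell should give a traceback that stops at the actual failing operation. If PyTorch reports native kernels for the device and the error still comes from the encoder call, that would be consistent with the custom kernels of the third-party extension being the part that lacks Blackwell support.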
