I am trying to set up an environment with PyTorch and torchsparse ( GitHub - mit-han-lab/torchsparse: [MICRO'23, MLSys'22] TorchSparse: Efficient Training and Inference Framework for Sparse Convolution on GPUs. ) with CUDA support to run the scenescript demo ( GitHub - facebookresearch/scenescript: Public code release associated with SceneScript. ).
I managed to set up the environment, but I am getting the following error:
AcceleratorError Traceback (most recent call last)
Cell In[6], line 1
----> 1 lang_seq = model_wrapper.run_inference(
2 point_cloud_obj.points,
3 nucleus_sampling_thresh=0.05, # 0.0 is argmax, 1.0 is random sampling
4 verbose=True,
5 )
File ~/miniconda3/envs/misc/lib/python3.12/site-packages/torch/utils/_contextlib.py:124, in context_decorator.<locals>.decorate_context(*args, **kwargs)
120 @functools.wraps(func)
121 def decorate_context(*args, **kwargs):
122 # pyrefly: ignore [bad-context-manager]
123 with ctx_factory():
--> 124 return func(*args, **kwargs)
File ~scenescript/src/networks/scenescript_model.py:238, in SceneScriptWrapper.run_inference(self, raw_point_cloud, nucleus_sampling_thresh, verbose)
236 # Encode the visual inputs
237 pc_sparse_tensor, pc_min = self.preprocess_point_cloud(raw_point_cloud)
--> 238 encoded_visual_input = self.model["encoder"](pc_sparse_tensor)
239 context = encoded_visual_input["context"]
240 context_mask = encoded_visual_input["context_mask"]
File ~/miniconda3/envs/misc/lib/python3.12/site-packages/torch/nn/modules/module.py:1776, in Module._wrapped_call_impl(self, *args, **kwargs)
1774 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]
...
Search for `cudaErrorIllegalAddress' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
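The hint in the message above about `CUDA_LAUNCH_BLOCKING=1` can be followed from inside a notebook with a sketch like the one below. It assumes the variable is set before the first CUDA call in the process (safest is before `import torch`), so in a Jupyter session the kernel would need to be restarted first. The helper name `enable_blocking_launches` is only illustrative and is not part of PyTorch or scenescript.

```python
import os


def enable_blocking_launches() -> str:
    """Request synchronous CUDA kernel launches for this process.

    Only takes effect if called before CUDA is initialized, so call it
    at the very top of the script or notebook, before importing torch.
    """
    os.environ["CUDA_LAUNCH_BLOCKING"] = "1"
    return os.environ["CUDA_LAUNCH_BLOCKING"]


# Run this first, then import torch and re-run the failing cell.
enable_blocking_launches()
```

With launches made synchronous, the traceback should point at the call that actually triggered the illegal memory access rather than at a later API call.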
Some environment info:
python --version
Python 3.12.12
- GCC:13.3
- NVCC:12.9
- PyTorch:2.10
- PyTorch CUDA:12.8
python -c "import torch; print(torch.__version__); print(torch.cuda.get_arch_list()); print(torch.randn(1).cuda())"
2.10.0+cu128
['sm_70', 'sm_75', 'sm_80', 'sm_86', 'sm_90', 'sm_100', 'sm_120']
tensor([0.3935], device='cuda:0')
I also tried another environment:
- GCC:13.3
- PyTorch:2.12
- PyTorch CUDA:13.0
python -c "import torch; print(torch.__version__); print(torch.cuda.get_arch_list()); print(torch.randn(1).cuda())"
2.12.0.dev20260217+cu130
['sm_75', 'sm_80', 'sm_86', 'sm_90', 'sm_100', 'sm_120', 'compute_120']
tensor([-0.3518], device='cuda:0')
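One check that might help narrow this down is whether the GPU's compute capability is covered by an arch list like the ones printed above. The sketch below is only an assumption about a possible cause, not a confirmed diagnosis. It covers PyTorch's own arch list only; torchsparse is compiled separately, and the architectures it was built for are not inspected here. `capability_covered` is a hypothetical helper, not a PyTorch API.

```python
def capability_covered(capability, arch_list):
    """Return True if a device with the given (major, minor) capability
    can run code built for at least one entry of arch_list.

    Rules assumed here:
      - an 'sm_XY' binary runs on devices with the same major version
        and an equal or higher minor version;
      - 'compute_XY' PTX can be JIT-compiled for any equal or newer device.
    Entries with letter suffixes (e.g. 'sm_90a') are skipped for simplicity.
    """
    target = capability[0] * 10 + capability[1]
    for arch in arch_list:
        kind, _, num = arch.partition("_")
        if not num.isdigit():
            continue
        value = int(num)
        if kind == "sm" and value // 10 == target // 10 and value <= target:
            return True
        if kind == "compute" and value <= target:
            return True
    return False


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None
    if torch is not None and torch.cuda.is_available():
        cap = torch.cuda.get_device_capability(0)
        print(cap, capability_covered(cap, torch.cuda.get_arch_list()))
```

If this prints `True` for PyTorch but the error persists, the mismatch, if there is one, would more likely be in how the torchsparse extension itself was built.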
How can I solve the cudaErrorIllegalAddress issue? @ptrblck