I’m trying to compile pytorch_scatter so I can run the Hunyuan3D node in ComfyUI on Windows, targeting my NVIDIA RTX 5070 (Blackwell, sm_120). Despite multiple attempts, every build fails with a `'std': ambiguous symbol` error at `compiled_autograd.h:1134` during nvcc compilation, and I’m stuck. I need this working ASAP and would like to know whether a compatible pytorch_scatter build exists (or is planned) for this setup, or whether there’s a workaround. Full details of my setup, goal, errors, and attempted solutions are below. Any help or suggestions would be greatly appreciated.
System Specifications
- OS: Windows 10
- GPUs:
- NVIDIA GeForce RTX 5070 (12GB VRAM, sm_120, Blackwell architecture)
- NVIDIA GeForce RTX 3060 Ti (8GB VRAM, sm_86, Ampere architecture)
- Driver: NVIDIA 580.97 (CUDA 13.0 support)
- CUDA Toolkit: 12.8 (verified via nvcc --version)
- Python: 3.10 (in E:\COMFY\ComfyUI\venv)
- PyTorch: Nightly 2.9.0.dev20250828+cu128
- Build Tools: Visual Studio 2022 Build Tools (MSVC 14.44.35207), Windows 10 SDK (10.0.26100.0)
- ComfyUI: Installed via .exe installer (not portable), using virtual environment at E:\COMFY\ComfyUI\venv
Goal
I’m trying to set up the Hunyuan3D node in ComfyUI, which requires pytorch_scatter, pytorch3d, and a custom rasterizer from the ComfyUI-Hunyuan3DWrapper repository. My primary GPU is the RTX 5070, and I need pytorch_scatter compiled with sm_120 support for CUDA 12.8. I also have an RTX 3060 Ti, so compatibility with sm_86 is a bonus but not critical. I want to avoid breaking other custom nodes (e.g., ComfyUI-3D-Pack) that rely on the current PyTorch/CUDA setup.
Error Details
When compiling pytorch_scatter from source (E:\COMFY\ComfyUI\custom_nodes\ComfyUI-3D-Pack\pytorch_scatter), I get the following error:
```
E:/COMFY/ComfyUI/venv/lib/site-packages/torch/include\torch/csrc/dynamo/compiled_autograd.h(1134): error C2872: 'std': ambiguous symbol
E:/COMFY/ComfyUI/venv/lib/site-packages/torch/include\c10/cuda/CUDAStream.h(261): note: could be 'std'
E:/COMFY/ComfyUI/venv/lib/site-packages/torch/include\torch/csrc/dynamo/compiled_autograd.h(1134): note: or 'std'
...
error: command 'C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\nvcc.exe' failed with exit code 2
```
The error occurs during nvcc compilation of `csrc\cuda\scatter_cuda.cu` and points to a namespace conflict in PyTorch’s `compiled_autograd.h` at line 1134, inside the `IValuePacker` template: `} else if constexpr (::std::is_same_v<T, ::std::string>) {`
I’ve tried multiple approaches to resolve this, based on suggestions from an AI assistant, but none have worked:

1. Modified setup.py:
   - Updated `E:\COMFY\ComfyUI\custom_nodes\ComfyUI-3D-Pack\pytorch_scatter\setup.py` to include:
     ```
     extra_compile_args={
         'cxx': ['/std:c++17', '-D_GLIBCXX_USE_CXX11_ABI=0', '-O3'],
         'nvcc': ['--std=c++17', '-D_GLIBCXX_USE_CXX11_ABI=0',
                  '-gencode=arch=compute_120,code=sm_120',
                  '--expt-relaxed-constexpr', '-O3'],
     }
     ```
   - Set environment variables:
     ```
     set CXXFLAGS=/std:c++17 -D_GLIBCXX_USE_CXX11_ABI=0
     set CUDAHOSTCXX="C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.44.35207\bin\HostX64\x64\cl.exe"
     set CUDA_HOME=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8
     ```
   - Ran:
     ```
     pip install . --force-reinstall -v --no-build-isolation
     ```
   - Result: same `std` error.
2. Patched compiled_autograd.h:
   - Attempted to change `std::vector<c10::TypePtr> packed_types_;` to `::std::vector<c10::TypePtr> packed_types_;` around line 1134, but the line there was already using `::std`. The error persists, suggesting the conflict comes from other `std` references in the file or in included headers.
3. Pre-built wheel:
   - Tried:
     ```
     pip install torch_scatter --index-url https://download.pytorch.org/whl/nightly/cu128
     ```
   - Result: no compatible wheel found for torch_scatter-2.1.2+cu128-cp310-cp310-win_amd64.whl.
4. Environment verification:
   - Confirmed the PyTorch setup:
     ```
     python -c "import torch; print(torch.__version__); print(torch.cuda.is_available()); print(torch.cuda.get_device_name(0))"
     ```
     Output: `2.9.0.dev20250828+cu128`, `True`, `NVIDIA GeForce RTX 5070`.
   - Verified sm_120 support:
     ```
     python -c "import torch; print(torch.cuda.get_device_capability(0))"
     ```
     Output: `(12, 0)`.
5. Build dependencies:
   - Installed wheel, setuptools, and ninja:
     ```
     pip install wheel setuptools ninja
     ```
   - Ensured Visual Studio Build Tools and CUDA 12.8 are correctly configured.
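For anyone comparing notes on the flags above: the capability tuples reported by `torch.cuda.get_device_capability()` map mechanically onto both nvcc’s `-gencode` flags and the `TORCH_CUDA_ARCH_LIST` environment variable that `torch.utils.cpp_extension` reads during an extension build. A small hedged sketch (these helpers are mine, not pytorch_scatter’s; note also that `-D_GLIBCXX_USE_CXX11_ABI=0` is a libstdc++/GCC switch and has no effect under MSVC on Windows):

```python
def gencode_flag(capability):
    """Map a torch.cuda.get_device_capability() tuple, e.g. (12, 0),
    onto the matching nvcc flag, e.g. -gencode=...,code=sm_120."""
    major, minor = capability
    return f"-gencode=arch=compute_{major}{minor},code=sm_{major}{minor}"

def arch_list_entry(capability):
    """Format accepted by TORCH_CUDA_ARCH_LIST, e.g. (8, 6) -> '8.6'."""
    major, minor = capability
    return f"{major}.{minor}"

blackwell, ampere = (12, 0), (8, 6)  # RTX 5070, RTX 3060 Ti

# Illustrative extra_compile_args covering both GPUs (hypothetical values):
extra_compile_args = {
    "cxx": ["/std:c++17", "/O2"],  # MSVC host-compiler flags
    "nvcc": ["--std=c++17", "--expt-relaxed-constexpr", "-O3",
             gencode_flag(ampere), gencode_flag(blackwell)],
}

# Equivalent env-var form: set TORCH_CUDA_ARCH_LIST=8.6;12.0 before pip install
print(";".join(arch_list_entry(c) for c in (ampere, blackwell)))  # 8.6;12.0
```

Setting `TORCH_CUDA_ARCH_LIST` avoids hand-editing setup.py and keeps sm_86 support for the 3060 Ti alongside sm_120.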
Questions
- Is there a known issue with PyTorch nightly 2.9.0.dev20250828+cu128 and CUDA 12.8 causing std namespace conflicts when compiling CUDA extensions for sm_120 on Windows?
- Are there pre-built pytorch_scatter wheels for CUDA 12.8, Python 3.10, and Windows that support sm_120? If not, when might they be available?
- Can you suggest a specific patch for compiled_autograd.h or other headers to resolve the std error?
- Should I downgrade to an older PyTorch nightly (e.g., 2.8.0.dev20250801+cu128) or try CUDA 13.0? I’m hesitant to use CUDA 13.0 due to potential conflicts with other ComfyUI nodes.
- Is WSL2 with Docker a reliable fallback for RTX 5070, and are there specific setup guides for Hunyuan3D?
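On the WSL2 question: before investing in a full Docker setup, a quick sanity check from a WSL2 Ubuntu shell confirms whether the RTX 5070 is even visible there (a hedged sketch; WSL2 maps the Windows NVIDIA driver in automatically, so no driver install inside the distro should be needed):

```shell
# Check GPU visibility inside WSL2 before attempting the Hunyuan3D build there.
if command -v nvidia-smi >/dev/null 2>&1; then
  # Print each GPU's name and compute capability (e.g. "NVIDIA GeForce RTX 5070, 12.0")
  nvidia-smi --query-gpu=name,compute_cap --format=csv,noheader
else
  echo "nvidia-smi not found: check the Windows NVIDIA driver and WSL2 GPU support"
fi
```

If the 5070 shows up with compute capability 12.0, the same pytorch_scatter build steps should work inside WSL2 with Linux wheels or a source build.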
Additional Context
- I’m using ComfyUI’s .exe installer (not portable), with the virtual environment at E:\COMFY\ComfyUI\venv.
- I’ve backed up venv and setup.py to avoid losing my setup.
- I need pytorch_scatter for the ComfyUI-Hunyuan3DWrapper node, which also requires pytorch3d and a custom rasterizer. I plan to download the bpt-8-16-500m.pt model from Hugging Face once pytorch_scatter is installed.
- My RTX 3060 Ti is secondary, but I’d prefer a solution that doesn’t break compatibility with sm_86.