RTX 5080, Win11: torchvision::nms couldn't run

I’m encountering an issue with torchvision::nms that has blocked my work for 2 days. Any help would be greatly appreciated!

System Configuration:

  • GPU: RTX 5080 Laptop GPU
  • OS: Windows 11
  • CUDA version: 12.8
  • PyTorch version: 2.6.0+cu128.nv
  • CUDA available: True
  • PyTorch CUDA version: 12.8
  • CUDA_HOME: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8

Error Message:

2025-07-25 21:15:19,545 - training_manager - ERROR - train failed: Could not run 'torchvision::nms' with arguments from the 'CUDA' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'torchvision::nms' is only available for these backends: [CPU, Meta, QuantizedCPU, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradMPS, AutogradXPU, AutogradHPU, AutogradLazy, AutogradMTIA, AutogradMeta, Tracer, AutocastCPU, AutocastXPU, AutocastMPS, AutocastCUDA, FuncTorchBatched, BatchedNestedTensor, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PreDispatch, PythonDispatcher].

CPU: registered at C:\actions-runner\_work\vision\vision\pytorch\vision\torchvision\csrc\ops\cpu\nms_kernel.cpp:112 [kernel]
Meta: registered at /dev/null:195 [kernel]
QuantizedCPU: registered at C:\actions-runner\_work\vision\vision\pytorch\vision\torchvision\csrc\ops\quantized\cpu\qnms_kernel.cpp:124 [kernel]
BackendSelect: fallthrough registered at C:\Users\user\pytorch\aten\src\ATen\core\BackendSelectFallbackKernel.cpp:3 [backend fallback]
Python: registered at C:\Users\user\pytorch\aten\src\ATen\core\PythonFallbackKernel.cpp:194 [backend fallback]
FuncTorchDynamicLayerBackMode: registered at C:\Users\user\pytorch\aten\src\ATen\functorch\DynamicLayer.cpp:503 [backend fallback]
Functionalize: registered at C:\Users\user\pytorch\aten\src\ATen\FunctionalizeFallbackKernel.cpp:349 [backend fallback]
Named: registered at C:\Users\user\pytorch\aten\src\ATen\core\NamedRegistrations.cpp:7 [backend fallback]
Conjugate: registered at C:\Users\user\pytorch\aten\src\ATen\ConjugateFallback.cpp:17 [backend fallback]
Negative: registered at C:\Users\user\pytorch\aten\src\ATen\native\NegateFallback.cpp:18 [backend fallback]
ZeroTensor: registered at C:\Users\user\pytorch\aten\src\ATen\ZeroTensorFallback.cpp:86 [backend fallback]
ADInplaceOrView: fallthrough registered at C:\Users\user\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:100 [backend fallback]
AutogradOther: registered at C:\Users\user\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:63 [backend fallback]
AutogradCPU: registered at C:\Users\user\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:67 [backend fallback]
AutogradCUDA: registered at C:\Users\user\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:75 [backend fallback]
AutogradXLA: registered at C:\Users\user\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:83 [backend fallback]
AutogradMPS: registered at C:\Users\user\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:91 [backend fallback]
AutogradXPU: registered at C:\Users\user\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:71 [backend fallback]
AutogradHPU: registered at C:\Users\user\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:104 [backend fallback]
AutogradLazy: registered at C:\Users\user\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:87 [backend fallback]
AutogradMTIA: registered at C:\Users\user\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:79 [backend fallback]
AutogradMeta: registered at C:\Users\user\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:95 [backend fallback]
Tracer: registered at C:\Users\user\pytorch\torch\csrc\autograd\TraceTypeManual.cpp:294 [backend fallback]
AutocastCPU: registered at C:\actions-runner\_work\vision\vision\pytorch\vision\torchvision\csrc\ops\autocast\nms_kernel.cpp:34 [kernel]
AutocastXPU: registered at C:\actions-runner\_work\vision\vision\pytorch\vision\torchvision\csrc\ops\autocast\nms_kernel.cpp:41 [kernel]
AutocastMPS: fallthrough registered at C:\Users\user\pytorch\aten\src\ATen\autocast_mode.cpp:209 [backend fallback]
AutocastCUDA: registered at C:\actions-runner\_work\vision\vision\pytorch\vision\torchvision\csrc\ops\autocast\nms_kernel.cpp:27 [kernel]
FuncTorchBatched: registered at C:\Users\user\pytorch\aten\src\ATen\functorch\LegacyBatchingRegistrations.cpp:731 [backend fallback]
BatchedNestedTensor: registered at C:\Users\user\pytorch\aten\src\ATen\functorch\LegacyBatchingRegistrations.cpp:758 [backend fallback]
FuncTorchVmapMode: fallthrough registered at C:\Users\user\pytorch\aten\src\ATen\functorch\VmapModeRegistrations.cpp:27 [backend fallback]
Batched: registered at C:\Users\user\pytorch\aten\src\ATen\LegacyBatchingRegistrations.cpp:1075 [backend fallback]
VmapMode: fallthrough registered at C:\Users\user\pytorch\aten\src\ATen\VmapModeRegistrations.cpp:33 [backend fallback]
FuncTorchGradWrapper: registered at C:\Users\user\pytorch\aten\src\ATen\functorch\TensorWrapper.cpp:207 [backend fallback]
PythonTLSSnapshot: registered at C:\Users\user\pytorch\aten\src\ATen\core\PythonFallbackKernel.cpp:202 [backend fallback]
FuncTorchDynamicLayerFrontMode: registered at C:\Users\user\pytorch\aten\src\ATen\functorch\DynamicLayer.cpp:499 [backend fallback]
PreDispatch: registered at C:\Users\user\pytorch\aten\src\ATen\core\PythonFallbackKernel.cpp:206 [backend fallback]
PythonDispatcher: registered at C:\Users\user\pytorch\aten\src\ATen\core\PythonFallbackKernel.cpp:198 [backend fallback]

2025-07-25 21:15:19,548 - main - ERROR - train failed
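The dispatcher trace above lists a CPU kernel for nms but no CUDA kernel, which points to a torchvision build shipped without CUDA kernels rather than a bug in the op itself. For reference, torchvision.ops.nms implements greedy non-maximum suppression; a stdlib-only sketch of that algorithm (illustrative helper names, not torchvision code) can be used to sanity-check expected outputs on CPU while the CUDA path is unavailable:

```python
def iou(a, b):
    """Intersection-over-union of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_threshold):
    """Greedy NMS: visit boxes by descending score, keep a box only if it
    does not overlap an already-kept box above the IoU threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores, 0.5))  # box 1 overlaps box 0 heavily -> [0, 2]
```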

Additional Information:

Previous successful setup:

  1. Used offline package: torch-2.6.0+cu128.nv-cp312-cp312-win_amd64.whl
  2. Used offline package: torchvision-0.20.0a0+cu128.nv-cp312-cp312-win_amd64.whl
  3. Installed with: pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128

Current issues:

  1. After running pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128, torch and torchvision were automatically upgraded to torch-2.7.0 and torchvision-0.22
  2. Cannot find a matching torchaudio version at https://download.pytorch.org/whl/torchaudio/; the expected wheel would be torchaudio-2.6.0+cu128-cp312-cp312-win_amd64.whl

I suspect there might be a version compatibility issue, but I’m not sure how to resolve it properly. Any guidance would be greatly appreciated!
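The symptoms are consistent with a torch/torchvision pairing problem: each torchvision release tracks a specific torch release (roughly, torchvision 0.20 pairs with torch 2.5, 0.21 with 2.6, and 0.22 with 2.7), and a wheel built against a different torch can import yet miss its compiled CUDA ops. A stdlib-only sketch of such a pairing check follows; the table is an assumption based on the usual release cadence, so verify it against the official compatibility matrix before relying on it:

```python
# Assumed torch -> torchvision minor-series pairing (not an official matrix).
COMPAT = {
    "2.5": "0.20",
    "2.6": "0.21",
    "2.7": "0.22",
}

def minor(version):
    """Strip local/build suffixes and reduce to the minor series,
    e.g. '2.6.0+cu128.nv' -> '2.6'."""
    base = version.split("+")[0]
    return ".".join(base.split(".")[:2])

def versions_match(torch_version, torchvision_version):
    """True if the torchvision minor series pairs with the torch minor series."""
    expected = COMPAT.get(minor(torch_version))
    return expected is not None and minor(torchvision_version).startswith(expected)

print(versions_match("2.7.0", "0.22.0"))                    # expected pairing
print(versions_match("2.6.0+cu128.nv", "0.20.0a0+cu128.nv"))  # 0.21 would be expected
```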

I don’t know where this version comes from, but I would recommend installing any PyTorch binary from our support matrix built with CUDA 12.8+.

  1. GPU: all CUDA toolkit versions were downloaded from the gateway; both 12.8 and 12.9 (the latest) have been tried.

  2. Torch (online installation)
    pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128

  3. Offline wheels
    Downloaded from w-e-w/torch-2.6.0-cu128.nv · Hugging Face
    The matching torchaudio-2.6.0+cu128-cp312-cp312-win_amd64.whl is missing

It is now confirmed that the GPU itself can be used, but the ‘torchvision::nms’ failure still breaks the system
– This is a blocking issue

Did you try to install the wheels we are building, as mentioned in my previous post? Based on your update, it seems you are still installing wheels from another mirror that we did not build.

“When will the issue of ‘RTX5080 win11 Torchvision::nms couldn’t run’ be fixed, and in which version is it expected to be resolved?”

All of our PyTorch binaries built with CUDA 12.8 support Blackwell GPUs, and I recommended installing these. So far it seems you are still using other builds, which raise the error. Let me know if you have had a chance to install and try our builds instead, as they should work.
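One way to confirm whether an installed wheel actually supports a Blackwell GPU is to compare the device's compute capability (as returned by torch.cuda.get_device_capability(); consumer Blackwell reports 12.0, i.e. sm_120) against the architectures the binary was compiled for (torch.cuda.get_arch_list()). A stdlib-only sketch of that comparison, with example values assumed rather than read from a live device:

```python
def is_supported(capability, arch_list):
    """Check whether a device compute capability is covered by the wheel.

    capability: tuple in the shape of torch.cuda.get_device_capability(),
                e.g. (12, 0) for a consumer Blackwell GPU.
    arch_list:  list in the shape of torch.cuda.get_arch_list(),
                e.g. ["sm_80", "sm_90", "sm_120"].
    """
    target = f"sm_{capability[0]}{capability[1]}"
    return any(arch.startswith(target) for arch in arch_list)

# Example values (assumed): a cu128 build including sm_120 covers Blackwell,
# while an older build topping out at sm_90 does not.
print(is_supported((12, 0), ["sm_80", "sm_90", "sm_120"]))  # True
print(is_supported((12, 0), ["sm_80", "sm_90"]))            # False
```

If the device capability is absent from the arch list, reinstalling from the official cu128 index rather than a third-party mirror is the likely fix.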