RTX 5080, Win11: torchvision::nms couldn't run

I’m encountering an issue with torchvision::nms that has blocked my work for 2 days. Any help would be greatly appreciated!

System Configuration:

  • GPU: RTX 5080 Laptop GPU
  • OS: Windows 11
  • CUDA version: 12.8
  • PyTorch version: 2.6.0+cu128.nv
  • CUDA available: True
  • PyTorch CUDA version: 12.8
  • CUDA_HOME: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8

Error Message:

2025-07-25 21:15:19,545 - training_manager - ERROR - train failed: Could not run 'torchvision::nms' with arguments from the 'CUDA' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'torchvision::nms' is only available for these backends: [CPU, Meta, QuantizedCPU, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradMPS, AutogradXPU, AutogradHPU, AutogradLazy, AutogradMTIA, AutogradMeta, Tracer, AutocastCPU, AutocastXPU, AutocastMPS, AutocastCUDA, FuncTorchBatched, BatchedNestedTensor, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PreDispatch, PythonDispatcher].

CPU: registered at C:\actions-runner\_work\vision\vision\pytorch\vision\torchvision\csrc\ops\cpu\nms_kernel.cpp:112 [kernel]
Meta: registered at /dev/null:195 [kernel]
QuantizedCPU: registered at C:\actions-runner\_work\vision\vision\pytorch\vision\torchvision\csrc\ops\quantized\cpu\qnms_kernel.cpp:124 [kernel]
BackendSelect: fallthrough registered at C:\Users\user\pytorch\aten\src\ATen\core\BackendSelectFallbackKernel.cpp:3 [backend fallback]
Python: registered at C:\Users\user\pytorch\aten\src\ATen\core\PythonFallbackKernel.cpp:194 [backend fallback]
FuncTorchDynamicLayerBackMode: registered at C:\Users\user\pytorch\aten\src\ATen\functorch\DynamicLayer.cpp:503 [backend fallback]
Functionalize: registered at C:\Users\user\pytorch\aten\src\ATen\FunctionalizeFallbackKernel.cpp:349 [backend fallback]
Named: registered at C:\Users\user\pytorch\aten\src\ATen\core\NamedRegistrations.cpp:7 [backend fallback]
Conjugate: registered at C:\Users\user\pytorch\aten\src\ATen\ConjugateFallback.cpp:17 [backend fallback]
Negative: registered at C:\Users\user\pytorch\aten\src\ATen\native\NegateFallback.cpp:18 [backend fallback]
ZeroTensor: registered at C:\Users\user\pytorch\aten\src\ATen\ZeroTensorFallback.cpp:86 [backend fallback]
ADInplaceOrView: fallthrough registered at C:\Users\user\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:100 [backend fallback]
AutogradOther: registered at C:\Users\user\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:63 [backend fallback]
AutogradCPU: registered at C:\Users\user\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:67 [backend fallback]
AutogradCUDA: registered at C:\Users\user\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:75 [backend fallback]
AutogradXLA: registered at C:\Users\user\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:83 [backend fallback]
AutogradMPS: registered at C:\Users\user\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:91 [backend fallback]
AutogradXPU: registered at C:\Users\user\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:71 [backend fallback]
AutogradHPU: registered at C:\Users\user\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:104 [backend fallback]
AutogradLazy: registered at C:\Users\user\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:87 [backend fallback]
AutogradMTIA: registered at C:\Users\user\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:79 [backend fallback]
AutogradMeta: registered at C:\Users\user\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:95 [backend fallback]
Tracer: registered at C:\Users\user\pytorch\torch\csrc\autograd\TraceTypeManual.cpp:294 [backend fallback]
AutocastCPU: registered at C:\actions-runner\_work\vision\vision\pytorch\vision\torchvision\csrc\ops\autocast\nms_kernel.cpp:34 [kernel]
AutocastXPU: registered at C:\actions-runner\_work\vision\vision\pytorch\vision\torchvision\csrc\ops\autocast\nms_kernel.cpp:41 [kernel]
AutocastMPS: fallthrough registered at C:\Users\user\pytorch\aten\src\ATen\autocast_mode.cpp:209 [backend fallback]
AutocastCUDA: registered at C:\actions-runner\_work\vision\vision\pytorch\vision\torchvision\csrc\ops\autocast\nms_kernel.cpp:27 [kernel]
FuncTorchBatched: registered at C:\Users\user\pytorch\aten\src\ATen\functorch\LegacyBatchingRegistrations.cpp:731 [backend fallback]
BatchedNestedTensor: registered at C:\Users\user\pytorch\aten\src\ATen\functorch\LegacyBatchingRegistrations.cpp:758 [backend fallback]
FuncTorchVmapMode: fallthrough registered at C:\Users\user\pytorch\aten\src\ATen\functorch\VmapModeRegistrations.cpp:27 [backend fallback]
Batched: registered at C:\Users\user\pytorch\aten\src\ATen\LegacyBatchingRegistrations.cpp:1075 [backend fallback]
VmapMode: fallthrough registered at C:\Users\user\pytorch\aten\src\ATen\VmapModeRegistrations.cpp:33 [backend fallback]
FuncTorchGradWrapper: registered at C:\Users\user\pytorch\aten\src\ATen\functorch\TensorWrapper.cpp:207 [backend fallback]
PythonTLSSnapshot: registered at C:\Users\user\pytorch\aten\src\ATen\core\PythonFallbackKernel.cpp:202 [backend fallback]
FuncTorchDynamicLayerFrontMode: registered at C:\Users\user\pytorch\aten\src\ATen\functorch\DynamicLayer.cpp:499 [backend fallback]
PreDispatch: registered at C:\Users\user\pytorch\aten\src\ATen\core\PythonFallbackKernel.cpp:206 [backend fallback]
PythonDispatcher: registered at C:\Users\user\pytorch\aten\src\ATen\core\PythonFallbackKernel.cpp:198 [backend fallback]

2025-07-25 21:15:19,548 - main - ERROR - train failed
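The dispatcher trace above lists a CPU kernel for nms but no CUDA kernel, which points to a torchvision build shipped without CUDA kernels rather than a bug in the op itself. For reference, torchvision.ops.nms implements greedy non-maximum suppression; a stdlib-only sketch of that algorithm (illustrative helper names, not torchvision code) can be used to sanity-check expected outputs on CPU while the CUDA path is unavailable:

```python
def iou(a, b):
    """Intersection-over-union of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_threshold):
    """Greedy NMS: visit boxes by descending score, keep a box only if it
    does not overlap an already-kept box above the IoU threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores, 0.5))  # box 1 overlaps box 0 heavily -> [0, 2]
```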

Additional Information:

Previous successful setup:

  1. Used offline package: torch-2.6.0+cu128.nv-cp312-cp312-win_amd64.whl
  2. Used offline package: torchvision-0.20.0a0+cu128.nv-cp312-cp312-win_amd64.whl
  3. Installed with: pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128

Current issues:

  1. After running pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128, torch and torchvision were automatically upgraded to torch-2.7.0 and torchvision-0.22
  2. Cannot find a matching torchaudio version at https://download.pytorch.org/whl/torchaudio/; the expected wheel would be torchaudio-2.6.0+cu128-cp312-cp312-win_amd64.whl

I suspect there might be a version compatibility issue, but I’m not sure how to resolve it properly. Any guidance would be greatly appreciated!
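The symptoms are consistent with a torch/torchvision pairing problem: each torchvision release tracks a specific torch release (roughly, torchvision 0.20 pairs with torch 2.5, 0.21 with 2.6, and 0.22 with 2.7), and a wheel built against a different torch can import yet miss its compiled CUDA ops. A stdlib-only sketch of such a pairing check follows; the table is an assumption based on the usual release cadence, so verify it against the official compatibility matrix before relying on it:

```python
# Assumed torch -> torchvision minor-series pairing (not an official matrix).
COMPAT = {
    "2.5": "0.20",
    "2.6": "0.21",
    "2.7": "0.22",
}

def minor(version):
    """Strip local/build suffixes and reduce to the minor series,
    e.g. '2.6.0+cu128.nv' -> '2.6'."""
    base = version.split("+")[0]
    return ".".join(base.split(".")[:2])

def versions_match(torch_version, torchvision_version):
    """True if the torchvision minor series pairs with the torch minor series."""
    expected = COMPAT.get(minor(torch_version))
    return expected is not None and minor(torchvision_version).startswith(expected)

print(versions_match("2.7.0", "0.22.0"))                    # expected pairing
print(versions_match("2.6.0+cu128.nv", "0.20.0a0+cu128.nv"))  # 0.21 would be expected
```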

I don’t know where this version comes from, but I would recommend installing any PyTorch binary from our support matrix built with CUDA 12.8+.

  1. GPU: all CUDA toolkit versions were downloaded from the gateway; both 12.8 and 12.9 (the latest) have been tried.

  2. Torch (online installation)
    pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128

  3. Offline wheels
    Downloaded from w-e-w/torch-2.6.0-cu128.nv · Hugging Face
    The matching torchaudio-2.6.0+cu128-cp312-cp312-win_amd64.whl is missing

It is now confirmed that the GPU itself can be used, but the ‘torchvision::nms’ failure still breaks the system
– This is a blocking issue

Did you try to install the wheels we are building, as mentioned in my previous post? Based on your update, it seems you are still installing wheels from another mirror that we did not build.

“When will the issue of ‘RTX5080 win11 Torchvision::nms couldn’t run’ be fixed, and in which version is it expected to be resolved?”

All of our PyTorch binaries built with CUDA 12.8 support Blackwell GPUs, and I recommended installing these. So far it seems you are still using other builds, which raise the error. Let me know if you have had a chance to install and try our builds instead, as they should work.
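One way to confirm whether an installed wheel actually supports a Blackwell GPU is to compare the device's compute capability (as returned by torch.cuda.get_device_capability(); consumer Blackwell reports 12.0, i.e. sm_120) against the architectures the binary was compiled for (torch.cuda.get_arch_list()). A stdlib-only sketch of that comparison, with example values assumed rather than read from a live device:

```python
def is_supported(capability, arch_list):
    """Check whether a device compute capability is covered by the wheel.

    capability: tuple in the shape of torch.cuda.get_device_capability(),
                e.g. (12, 0) for a consumer Blackwell GPU.
    arch_list:  list in the shape of torch.cuda.get_arch_list(),
                e.g. ["sm_80", "sm_90", "sm_120"].
    """
    target = f"sm_{capability[0]}{capability[1]}"
    return any(arch.startswith(target) for arch in arch_list)

# Example values (assumed): a cu128 build including sm_120 covers Blackwell,
# while an older build topping out at sm_90 does not.
print(is_supported((12, 0), ["sm_80", "sm_90", "sm_120"]))  # True
print(is_supported((12, 0), ["sm_80", "sm_90"]))            # False
```

If the device capability is absent from the arch list, reinstalling from the official cu128 index rather than a third-party mirror is the likely fix.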