The snippet below works on my setup with the torchvision binaries built against the CUDA runtime and fails with the CPU-only package:
import torch
import torchvision
device = 'cuda'
boxes = torch.tensor([[0., 1., 2., 3.]]).to(device)
scores = torch.randn(1).to(device)
iou_thresholds = 0.5
print(torchvision.ops.nms(boxes, scores, iou_thresholds))
With CUDA:
pip install torchvision==0.10.0+cu111 -f https://download.pytorch.org/whl/torch_stable.html
python tmp.py
>tensor([0], device='cuda:0')
Without CUDA:
pip uninstall torchvision
pip install torchvision==0.10.0+cpu -f https://download.pytorch.org/whl/torch_stable.html
python tmp.py
Traceback (most recent call last):
File "tmp.py", line 10, in <module>
print(torchvision.ops.nms(boxes, scores, iou_thresholds))
File "/opt/conda/envs/tmp/lib/python3.8/site-packages/torchvision/ops/boxes.py", line 35, in nms
return torch.ops.torchvision.nms(boxes, scores, iou_threshold)
NotImplementedError: Could not run 'torchvision::nms' with arguments from the 'CUDA' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'torchvision::nms' is only available for these backends: [CPU, QuantizedCPU, BackendSelect, Named, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, UNKNOWN_TENSOR_TYPE_ID, AutogradMLC, Tracer, Autocast, Batched, VmapMode].
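For reference, a minimal workaround sketch (my own addition, not part of the repro above): catch the NotImplementedError raised by the CPU-only wheel and rerun nms on CPU copies of the tensors.
import torch
import torchvision

# Workaround sketch (assumption, not from the original report): fall back to the
# CPU kernel when the installed torchvision wheel was built without CUDA support.
device = 'cuda' if torch.cuda.is_available() else 'cpu'
boxes = torch.tensor([[0., 1., 2., 3.]], device=device)
scores = torch.randn(1, device=device)
iou_threshold = 0.5

try:
    keep = torchvision.ops.nms(boxes, scores, iou_threshold)
except NotImplementedError:
    # CPU-only torchvision build: run nms on CPU copies; the returned indices
    # are still valid for the original CUDA tensors.
    keep = torchvision.ops.nms(boxes.cpu(), scores.cpu(), iou_threshold)
print(keep)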