YOLOv5 model not loading when CUDA is enabled

I am trying to get a YOLOv5 model to run with CUDA in C++ using the LibTorch library. The model was converted to a TorchScript model using the Python converter script. I was able to get the model to work in CPU mode without a problem. In all cases I am only loading pretrained parameters, not doing any training.

With CUDA enabled, however, I get this result:

[ObjectDetector()] torch::jit::load( yolov5s.torchscript.pt ); …
Could not run ‘aten::empty_strided’ with arguments from the ‘CUDA’ backend. ‘aten::empty_strided’ is only available for these backends: [CPU, BackendSelect, Named, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, Tracer, Autocast, Batched, VmapMode].

CPU: registered at aten\src\ATen\CPUType.cpp:2127 [kernel]
BackendSelect: registered at aten\src\ATen\BackendSelectRegister.cpp:761 [kernel]
Named: registered at …\aten\src\ATen\core\NamedRegistrations.cpp:7 [backend fallback]
AutogradOther: registered at …\torch\csrc\autograd\generated\VariableType_0.cpp:7974 [autograd kernel]
AutogradCPU: registered at …\torch\csrc\autograd\generated\VariableType_0.cpp:7974 [autograd kernel]
AutogradCUDA: registered at …\torch\csrc\autograd\generated\VariableType_0.cpp:7974 [autograd kernel]
AutogradXLA: registered at …\torch\csrc\autograd\generated\VariableType_0.cpp:7974 [autograd kernel]
AutogradPrivateUse1: registered at …\torch\csrc\autograd\generated\VariableType_0.cpp:7974 [autograd kernel]
AutogradPrivateUse2: registered at …\torch\csrc\autograd\generated\VariableType_0.cpp:7974 [autograd kernel]
AutogradPrivateUse3: registered at …\torch\csrc\autograd\generated\VariableType_0.cpp:7974 [autograd kernel]
Tracer: registered at …\torch\csrc\autograd\generated\TraceType_0.cpp:9341 [kernel]
Autocast: fallthrough registered at …\aten\src\ATen\autocast_mode.cpp:254 [backend fallback]
Batched: registered at …\aten\src\ATen\BatchingRegistrations.cpp:511 [backend fallback]
VmapMode: fallthrough registered at …\aten\src\ATen\VmapModeRegistrations.cpp:33 [backend fallback]

Exception raised from reportError at …\aten\src\ATen\core\dispatch\OperatorEntry.cpp:363 (most recent call first):
00007FFE6357A7B200007FFE6357A750 c10.dll!c10::Error::Error [ @ ]
00007FFDF269D4E500007FFDF269D2F0 torch_cpu.dll!c10::impl::OperatorEntry::reportError [ @ ]
00007FFDF2A69C9700007FFDF2A63920 torch_cpu.dll!at::native::mkldnn_sigmoid_ [ @ ]
00007FFDF2A8E36A00007FFDF2A63920 torch_cpu.dll!at::native::mkldnn_sigmoid_ [ @ ]
00007FFDF2A8C07900007FFDF2A63920 torch_cpu.dll!at::native::mkldnn_sigmoid_ [ @ ]
00007FFDF2A69B9600007FFDF2A63920 torch_cpu.dll!at::native::mkldnn_sigmoid_ [ @ ]
00007FFDF2C3D3C900007FFDF2C3D1C0 torch_cpu.dll!at::empty_strided [ @ ]
00007FFDF3B4F19300007FFDF3B42EE0 torch_cpu.dll!torch::autograd::GraphRoot::apply [ @ ]
00007FFDF2A8C07900007FFDF2A63920 torch_cpu.dll!at::native::mkldnn_sigmoid_ [ @ ]
00007FFDF2A69B9600007FFDF2A63920 torch_cpu.dll!at::native::mkldnn_sigmoid_ [ @ ]
00007FFDF2A8E36A00007FFDF2A63920 torch_cpu.dll!at::native::mkldnn_sigmoid_ [ @ ]
00007FFDF2A8C07900007FFDF2A63920 torch_cpu.dll!at::native::mkldnn_sigmoid_ [ @ ]
00007FFDF2A69B9600007FFDF2A63920 torch_cpu.dll!at::native::mkldnn_sigmoid_ [ @ ]
00007FFDF2C3D3C900007FFDF2C3D1C0 torch_cpu.dll!at::empty_strided [ @ ]
00007FFDF28E8EFE00007FFDF28E8BD0 torch_cpu.dll!at::native::to_dense_backward [ @ ]
00007FFDF28E8AF500007FFDF28E89C0 torch_cpu.dll!at::native::to [ @ ]
00007FFDF2D3D3CF00007FFDF2C9B090 torch_cpu.dll!at::zeros_out [ @ ]
00007FFDF262086500007FFDF260AE70 torch_cpu.dll!at::BatchedTensorImpl::strides [ @ ]
00007FFDF2D53EDB00007FFDF2D46BB0 torch_cpu.dll!at::is_custom_op [ @ ]
00007FFDF2D8E61D00007FFDF2D8E4F0 torch_cpu.dll!at::Tensor::to [ @ ]
00007FFDF4153E4B00007FFDF4152980 torch_cpu.dll!torch::jit::Unpickler::readInstruction [ @ ]
00007FFDF4156BF000007FFDF4156B20 torch_cpu.dll!torch::jit::Unpickler::run [ @ ]
00007FFDF415145200007FFDF4151420 torch_cpu.dll!torch::jit::Unpickler::parse_ivalue [ @ ]
00007FFDF4125DD300007FFDF41259E0 torch_cpu.dll!torch::jit::readArchiveAndTensors [ @ ]
00007FFDF41259AF00007FFDF41240E0 torch_cpu.dll!torch::jit::load [ @ ]
00007FFDF412329400007FFDF41125A0 torch_cpu.dll!torch::jit::hasGradientInfoForSchema [ @ ]
00007FFDF412429300007FFDF41240E0 torch_cpu.dll!torch::jit::load [ @ ]
00007FFDF41240CB00007FFDF4124050 torch_cpu.dll!torch::jit::load [ @ ]
00007FF6E7FC8A5F00007FF6E7FC87E0 PytorchTestTwo.exe!main [C:\Leidos\SW\PytorchTestTwo\PytorchTestTwo.cpp @ 495]
00007FF6E7FD4B4400007FF6E7FD4A38 PytorchTestTwo.exe!__scrt_common_main_seh [d:\agent_work\4\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 288]
00007FFEA509703400007FFEA5097020 KERNEL32.DLL!BaseThreadInitThunk [ @ ]
00007FFEA6CA265100007FFEA6CA2630 ntdll.dll!RtlUserThreadStart [ @ ]

I have tried different versions of LibTorch and different CUDA versions, but the result does not change. Is there any advice on what might be causing this error and how to fix it?

I previously got this model to run in Python using CUDA, though it was not exported to TorchScript for that.