No kernel image is available for execution on the device: what to do?

I know there are several other questions about this, but none of them helped.
I have the following setup:

- NVIDIA GeForce GTX 960M (compute capability 5.0)
- PyTorch: 2.1.1
- CUDA: 11.8

However, I get this error:

YOLOv5  2023-12-12 Python-3.9.5 torch-2.1.1+cu118 CUDA:0 (NVIDIA GeForce GTX 960M, 2048MiB)
2024-05-22 11:42:51.0616616 [E:onnxruntime:, sequential_executor.cc:514 onnxruntime::ExecuteKernel] Non-zero status code returned while running QuickGelu node. Name:'QuickGelu' 
Status Message: CUDA error cudaErrorNoKernelImageForDevice:no kernel image is available for execution on the device
YoloV5 failed --> Exception raised: [ONNXRuntimeError] : 1 : FAIL : Non-zero status code returned while running QuickGelu node. Name:'QuickGelu' Status Message: CUDA error cudaErrorNoKernelImageForDevice:no kernel image is available for execution on the device

I have no idea why this is happening.
The torch version is stable and CUDA is supported… what’s going on?

The error seems to be raised by ONNX Runtime, not PyTorch, so you might want to check which GPU architectures their CUDA build supports.
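
To narrow it down, a rough diagnostic like the one below might help. It is only a sketch: `arch_covers_device` and `report` are hypothetical helper names, and the matching rule is a simplification (an `sm_XY` binary is assumed to run on devices with the same major version and an equal or higher minor version, and `compute_XY` PTX on any equal or newer device). As far as I know ONNX Runtime does not expose the list of architectures it was compiled for, so the script only prints its version and available execution providers.

```python
def arch_covers_device(arch_list, capability):
    """Return True if any entry in arch_list can run on a device with the
    given (major, minor) compute capability (simplified rule, see above)."""
    major, minor = capability
    for arch in arch_list:
        kind, _, number = arch.partition("_")
        if not number.isdigit():
            continue  # skip suffixed variants such as "sm_90a"
        a_major, a_minor = divmod(int(number), 10)
        # Compiled binaries: same major version, minor not newer than the device.
        if kind == "sm" and a_major == major and a_minor <= minor:
            return True
        # PTX can be JIT-compiled for any equal or newer device.
        if kind == "compute" and (a_major, a_minor) <= (major, minor):
            return True
    return False


def report():
    """Print what torch and onnxruntime report, if they are installed."""
    try:
        import torch
    except ImportError:
        print("torch is not installed")
    else:
        if torch.cuda.is_available():
            cap = torch.cuda.get_device_capability(0)
            archs = torch.cuda.get_arch_list()
            print("device capability:", cap)
            print("torch built for:", archs)
            print("torch covers device:", arch_covers_device(archs, cap))
        else:
            print("torch cannot see a CUDA device")

    try:
        import onnxruntime as ort
    except ImportError:
        print("onnxruntime is not installed")
    else:
        print("onnxruntime version:", ort.__version__)
        print("onnxruntime providers:", ort.get_available_providers())


if __name__ == "__main__":
    report()
```

If torch reports that it covers the device but the ONNX Runtime session still fails, the mismatch is most likely on the ONNX Runtime side. In that case, creating the session with `providers=["CPUExecutionProvider"]` should at least confirm the model itself runs, at the cost of speed.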