[Question] no kernel image available

Hello,
I have an issue using CUDA with PyTorch and get the following error:

print(torch.randn(1).cuda())
print(torch.rand(5, 3, device=torch.device('cuda')))

RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

  • installed packages:

python-cuda-12.1.0-2 (*I suspected this might be the issue, but I downgraded CUDA to 12.1 as well)
python-pytorch-cuda-2.1.0-1
torchvision-cuda-0.15.2-1

*This same build works on another user's GPU.

  • logs:

torch.cuda.is_available()
True
print(torch.cuda.get_arch_list())
['sm_52', 'sm_53', 'sm_60', 'sm_61', 'sm_62', 'sm_70', 'sm_72', 'sm_75', 'sm_80', 'sm_86', 'sm_89', 'sm_90', 'compute_90']

  • specifications:

NVIDIA-SMI 545.29.02
Driver Version: 545.29.02
CUDA Version: 12.3
Linux version: 6.6.1-arch1-1, not using conda
GPU: NVIDIA GTX 960M, Maxwell architecture (sm_50, sm_52, or sm_53; I think the 900 series is sm_52 according to here)

  • attempts:

I downgraded CUDA to 12.1, but as discussed here*1, CUDA 12.3 seems to be compatible, and matching the python-cuda version didn't change anything.

*1 https://discuss.pytorch.org/t/question-i-have-a-question-about-installing-pytorch/191829/6

I wanted to build from source, but it was unclear whether I need to pass extra arguments and, if so, which ones.

I tried installing via pip but kept getting a cache error of some kind. I may be able to download the wheel with wget and install it locally.

I have no issues when running with device=torch.device('cpu'), though.

Your locally installed CUDA toolkit won’t be used if you install PyTorch binaries, since they ship with their own CUDA runtime dependencies as also explained in the linked post. You would only need a properly installed NVIDIA driver.
If your GPU is too old, try installing the PyTorch binaries shipping with CUDA 11.8 as they support down to sm_37.
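
To check whether a given build can target a given GPU at all, it can help to compare the device's compute capability with the output of torch.cuda.get_arch_list(). The following is only a rough sketch: the helper name build_supports_device and the matching rule are my own assumptions (an sm_XY entry is treated as usable on a device with the same major version and an equal or higher minor version, and a compute_XY PTX entry as usable on any equal or newer device). As far as I know, NVIDIA lists the GTX 960M with compute capability 5.0, but torch.cuda.get_device_capability() is the authoritative check on the actual machine.

```python
import re

# Arch list copied from the get_arch_list() output posted above.
ARCH_LIST = ['sm_52', 'sm_53', 'sm_60', 'sm_61', 'sm_62', 'sm_70', 'sm_72',
             'sm_75', 'sm_80', 'sm_86', 'sm_89', 'sm_90', 'compute_90']


def build_supports_device(capability, arch_list):
    """Hypothetical helper: True if any arch entry can plausibly run on the device.

    capability is a (major, minor) tuple as returned by
    torch.cuda.get_device_capability().
    """
    major, minor = capability
    for arch in arch_list:
        # Last digit is the minor version, the digits before it the major one.
        match = re.fullmatch(r"(sm|compute)_(\d+)(\d)", arch)
        if match is None:
            continue  # skip entries this sketch does not understand
        kind, a_major, a_minor = match.group(1), int(match.group(2)), int(match.group(3))
        # Compiled kernels (sm_XY): same major version, minor not above the device's.
        if kind == "sm" and a_major == major and a_minor <= minor:
            return True
        # PTX (compute_XY): can be JIT-compiled for any equal or newer device.
        if kind == "compute" and (a_major, a_minor) <= (major, minor):
            return True
    return False


# Assumed capability of the GTX 960M (5.0); not covered by the list above.
print(build_supports_device((5, 0), ARCH_LIST))  # False

# On a machine with PyTorch and a visible GPU, use the real values instead.
try:
    import torch
except ImportError:
    torch = None

if torch is not None and torch.cuda.is_available():
    cap = torch.cuda.get_device_capability(0)
    print(cap, build_supports_device(cap, torch.cuda.get_arch_list()))
```

If this prints False for your device, the installed build contains no kernels the GPU can execute, which would match the "no kernel image is available" error. In that case a binary that includes your architecture is needed (for example the CUDA 11.8 builds mentioned above), or, when building from source, the TORCH_CUDA_ARCH_LIST environment variable can be set to the target architecture (e.g. "5.0") before the build.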