you saved my life. thank you. Works with “GeForce RTX 5050” too!
Yes, basically the old CUDA builds do not work.
Upgrading PyTorch in the existing venv or system to a CUDA 12.8+ nightly with sm_120 support does the job.
These are the versions I tested:
python -c "import sys;import platform;print(sys.version);print('Platform:',platform.platform())"
3.10.11 (tags/v3.10.11:7d4cc5a, Apr 5 2023, 00:38:17) [MSC v.1929 64 bit (AMD64)]
python -c "import torch;print(torch.__version__, torch.version.cuda, torch.cuda.is_available());print(torch.cuda.get_device_name(0))"
2.9.0.dev20250905+cu128 12.8 True
NVIDIA GeForce RTX 5080
You can reinstall using the wheel specified for the nightly build, but I think the stable build against CUDA 12.8 also works fine.
pip uninstall -y torch torchvision
pip install --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/cu128
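After reinstalling, you can confirm the new build actually ships Blackwell kernels by checking the compiled architecture list. This is a minimal sketch, assuming the nightly wheel installed correctly; a Blackwell-capable build should list sm_120:
import torch
# PyTorch version and the CUDA toolkit it was built against.
print(torch.__version__, torch.version.cuda)
# Compute architectures compiled into this wheel; look for "sm_120".
print(torch.cuda.get_arch_list())
# Detected GPU and whether CUDA is usable.
print(torch.cuda.is_available(), torch.cuda.get_device_name(0))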
Hello,
I recently bought an RTX 5060 Ti 16GB (Blackwell architecture) GPU. My system has:
- NVIDIA Driver: 581.xx (latest, supports CUDA 13)
- CUDA Toolkit: 13.x
- Python: 3.10.x
- Stable Diffusion (Automatic1111 & ComfyUI frontends)
When I run Stable Diffusion with PyTorch, I get the following error:
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call,
so the stacktrace below might be incorrect.
It looks like PyTorch does not yet include kernels for the new SM version of Blackwell GPUs (possibly SM 12.x).
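A quick way to check this (a sketch, assuming the PyTorch import itself works) is to compare the GPU's compute capability with the architectures the installed wheel ships kernels for:
import torch
# Compute capability of the GPU, e.g. (12, 0) for Blackwell.
print(torch.cuda.get_device_capability(0))
# Architectures the installed wheel was built for; if "sm_120" is
# missing, the "no kernel image" error is expected.
print(torch.cuda.get_arch_list())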
I tried:
- Installing PyTorch 2.0.1 with CUDA 11.8 → same error
- Installing PyTorch 2.1.x with CUDA 12.1 → same error
- Installing latest nightly builds (cu121) → still the same error
- Running with --skip-torch-cuda-test → skips the check, but still falls back to CPU execution
So, my questions are:
- When will official PyTorch builds include RTX 5060 Ti (Blackwell) / CUDA 13 support?
- Is there a temporary workaround (nightly wheels, source build with CUDA 13) to enable GPU acceleration?
- Do I need to compile PyTorch from source with CUDA 13 myself until official support lands?
Any guidance would be really helpful. Right now Stable Diffusion only runs on CPU, which is very slow compared to GPU.
Thanks!
You need to install any of our PyTorch binaries built with CUDA 12.8+ as described in this thread to execute code on your Blackwell GPU.
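If you prefer not to use nightlies, the stable wheels built against CUDA 12.8 should also work; as a sketch (adjust the index URL if you need a different CUDA version):
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128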
For people running Stable Diffusion and encountering the error:
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
while having a working PyTorch install, for example (copied from a message above):
import torch
input_ids = torch.randint(0, 1000, (1, 10), device="cuda")
embedding = torch.nn.Embedding(1000, 64).cuda()
output = embedding(input_ids)
print(output)
The issue is mostly that the working PyTorch version is not installed in the Stable Diffusion WebUI virtual environment. You can verify this by running the snippet above using the Python inside the WebUI venv:
C:\Users...\stable-diffusion-webui\venv\Scripts\python.exe
If it fails there, you need to install the correct PyTorch version in that environment.
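For example, from a regular command prompt (a minimal sketch; the path follows the usual WebUI layout, so adjust it to your install, and the tiny tensor op stands in for the embedding example above):
C:\Users...\stable-diffusion-webui\venv\Scripts\python.exe -c "import torch; print(torch.__version__); print(torch.ones(2, device='cuda') * 2)"
If the venv's wheel lacks Blackwell (sm_120) kernels, this fails with the same "no kernel image" error.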
On Windows, the steps are:
cd C:\Users...\stable-diffusion-webui\venv\Scripts
.\activate (if PowerShell blocks this, run: Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass)
Then, inside the venv:
pip uninstall torch torchvision torchaudio -y
pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu129
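After the install finishes, it may be worth re-running the check from inside the still-activated venv before launching the WebUI (a sketch mirroring the version check earlier in this thread):
python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available()); print(torch.cuda.get_device_name(0))"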
It worked for me, hope it helps others.