The nightly binary with CUDA 12.8

Your CUDA toolkit raises this error because it is too old: support for Blackwell architectures was added in CUDA 12.8. Besides that, your PyTorch submodule is also too old, as the PRs enabling Blackwell landed right when 12.8 was released (Jan/Feb 2025).
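If you want to double check which CUDA version you are dealing with, something like the sketch below might help. It compares a CUDA version string against the 12.8 minimum mentioned above. The helper names (`parse_cuda_version`, `cuda_supports_blackwell`) are made up for this example and are not part of PyTorch. Note that `torch.version.cuda` reports the CUDA version an installed PyTorch build was compiled with, so for a source build you would still need to check your local toolkit separately.

```python
from typing import Optional, Tuple

# Minimum CUDA version with Blackwell support, per the explanation above.
MIN_CUDA_FOR_BLACKWELL = (12, 8)


def parse_cuda_version(version: str) -> Tuple[int, int]:
    """Turn a string such as '12.8' or '12.8.1' into (major, minor)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def cuda_supports_blackwell(version: Optional[str]) -> bool:
    """Hypothetical helper: True if the CUDA version is at least 12.8."""
    if version is None:  # e.g. a CPU-only PyTorch build
        return False
    return parse_cuda_version(version) >= MIN_CUDA_FOR_BLACKWELL


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
    else:
        # CUDA version this PyTorch build was compiled against (None on CPU builds).
        print("CUDA version used by PyTorch:", torch.version.cuda)
        print("New enough for Blackwell:", cuda_supports_blackwell(torch.version.cuda))
        # GPU architectures this build was compiled for (empty without CUDA).
        print("Compiled architectures:", torch.cuda.get_arch_list())
```

If the version check fails, or the architecture list does not include your GPU's architecture, the installed binary was most likely not built with Blackwell support.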