PyTorch version error

Can anyone please tell me which version of PyTorch is compatible with an A100-PCIE-40GB (CUDA capability sm_80)? I have tried installing different versions but keep getting the same error:
“RuntimeError: CUDA error: no kernel image is available for execution on the device.
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1”
Even after passing CUDA_LAUNCH_BLOCKING=1, I get the same error: “RuntimeError: CUDA error: no kernel image is available for execution on the device.” Please help.

Every PyTorch build using CUDA>=11.0 is compatible with sm_80 and thus your A100.
The latest stable torch==2.0.1 release uses CUDA 11.7 and 11.8, and the current nightly builds use CUDA 11.8 and 12.1. All of them will work on your A100.
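The rule above (a build made with CUDA >= 11.0 includes sm_80 support) can be written as a small check. This is only a sketch: the function names `parse_cuda_version` and `build_supports_sm80` are made up for illustration and are not part of PyTorch.

```python
# Rule from the reply above: PyTorch builds using CUDA >= 11.0 support sm_80.
MIN_CUDA_FOR_SM80 = (11, 0)


def parse_cuda_version(text):
    """Turn a version string such as '11.8' into a comparable tuple (11, 8)."""
    parts = text.strip().split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return major, minor


def build_supports_sm80(cuda_version):
    """Return True if a build made with this CUDA version can target sm_80."""
    return parse_cuda_version(cuda_version) >= MIN_CUDA_FOR_SM80


if __name__ == "__main__":
    # CUDA versions mentioned in this thread.
    for version in ("10.2", "11.7", "11.8", "12.1"):
        print(version, build_supports_sm80(version))
```

In a real environment the version string would come from torch.version.cuda, which is a string such as "10.2" for CUDA builds and None for CPU-only builds.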

So is there any other reason I might be getting this error? My script runs fine on a V100.

It depends on which PyTorch version you have installed; you can check it via python -m torch.utils.collect_env. For example, if it’s an older PyTorch release built with CUDA 10.2, this error would be expected and you would need to update.
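The same check can also be done from inside Python by comparing the GPU's compute capability with the architectures the installed binary was compiled for. The sketch below is a simplified heuristic, not PyTorch's own compatibility logic: `device_is_covered` and `report_installed_build` are hypothetical helper names, and the matching rule (same major version for sm_ entries, any older compute_ PTX entry) is an assumption about how binary and PTX compatibility usually work.

```python
import re

# Matches entries such as 'sm_70', 'sm_80' or 'compute_80'.
_ARCH = re.compile(r"(sm|compute)_(\d+)(\d)[a-z]?")


def device_is_covered(capability, arch_list):
    """Heuristic: does any compiled architecture cover this device?

    capability is a (major, minor) tuple, e.g. (8, 0) for an A100.
    arch_list is a list of strings such as ['sm_70', 'sm_80'].
    """
    major, minor = capability
    for tag in arch_list:
        match = _ARCH.fullmatch(tag)
        if match is None:
            continue  # skip entries this sketch does not understand
        kind = match.group(1)
        arch = (int(match.group(2)), int(match.group(3)))
        # Assumption: sm_XY binaries run on devices with the same major
        # version and an equal or higher minor version.
        if kind == "sm" and arch[0] == major and arch[1] <= minor:
            return True
        # Assumption: compute_XY (PTX) can be JIT-compiled for newer devices.
        if kind == "compute" and arch <= (major, minor):
            return True
    return False


def report_installed_build():
    """Print what the locally installed PyTorch reports, if it is available."""
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment.")
        return
    print("PyTorch:", torch.__version__, "built with CUDA:", torch.version.cuda)
    if not torch.cuda.is_available():
        print("No usable CUDA device found.")
        return
    capability = torch.cuda.get_device_capability(0)
    arch_list = torch.cuda.get_arch_list()
    print("Device:", torch.cuda.get_device_name(0), "capability:", capability)
    print("Compiled for:", arch_list)
    print("Covered:", device_is_covered(capability, arch_list))


if __name__ == "__main__":
    report_installed_build()
```

With an illustrative arch list of ['sm_37', 'sm_50', 'sm_60', 'sm_70'], the helper returns False for an A100 (8, 0) and True for a V100 (7, 0), which would match a script that runs on one card and not the other.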

These are the details:

PyTorch version: 1.10.1+cu102
Is debug build: False
CUDA used to build PyTorch: 10.2
ROCM used to build PyTorch: N/A

Do I need to update my PyTorch version?

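Yes, you need to update the PyTorch binary and install one built with CUDA 11 or newer. As I guessed, the currently installed binary was built with CUDA 10.2, which cannot target sm_80 and therefore ships no kernels for the A100. That also explains why the same script runs on the V100, which is an sm_70 device.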