PyTorch 1.6 - "Tesla T4 with CUDA capability sm_75 is not compatible"

I installed PyTorch 1.6 with pip as follows.

pip install --no-cache-dir torch==1.6.0+cu101 torchvision==0.7.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html

However, PyTorch now prints the following warning.

Tesla T4 with CUDA capability sm_75 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.

How come the pre-built PyTorch binaries for 1.6 do not support the newest CUDA capability? Is this intentional? How can I fix my setup? Thanks!
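For anyone hitting the same warning, here is a minimal sketch of how one might compare the GPU's compute capability with the architectures the installed build was compiled for. It assumes a PyTorch version that exposes torch.cuda.get_arch_list(); the helper names sm_name and natively_supported are made up for illustration.

```python
# Sketch: report what the installed PyTorch build was compiled for.
# Assumption: torch.cuda.get_arch_list() exists in the installed version.

def sm_name(capability):
    """Turn a (major, minor) capability tuple such as (7, 5) into 'sm_75'."""
    major, minor = capability
    return f"sm_{major}{minor}"


def natively_supported(capability, arch_list):
    """True if the build lists exactly this device's architecture."""
    return sm_name(capability) in arch_list


try:
    import torch
except ImportError:  # lets the helpers above be used without PyTorch installed
    torch = None

if torch is not None and torch.cuda.is_available():
    cap = torch.cuda.get_device_capability(0)
    # Fall back to an empty list on versions without get_arch_list().
    archs = torch.cuda.get_arch_list() if hasattr(torch.cuda, "get_arch_list") else []
    print("torch:", torch.__version__, "| built for CUDA:", torch.version.cuda)
    print("device:", torch.cuda.get_device_name(0), "->", sm_name(cap))
    print("build supports:", " ".join(archs))
    print("exact match in build:", natively_supported(cap, archs))
```

If the last line prints False while the device reports sm_75, the build is in the state described in this thread.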

I’m facing this issue too, with a Quadro RTX 8000. The official recommendation is to build from source, but I don’t understand why it is not supported.

@seemethere, was sm_75 dropped accidentally? I ask because the release notes only mention 6.1 and PTX for 3.7.
However, this PR seems to have also removed 7.5.

I get the same warning on a Colab Tesla T4 GPU, but it doesn’t seem to affect anything: I can still use CUDA.

I’m able to use the GPU, but I’m not sure whether all CUDA functionality works as intended given this warning. This is strange behavior. Building PyTorch from source isn’t easy on my restricted server.

I think sm_75 is there for CUDA 10.2. (But my poor 1080 Ti is dropped.)

Facing the same issue here. My card is a GeForce RTX 2080 Ti. What does this warning mean? torch.cuda.is_available() returns True and I can train on the GPU. Do I need to fix it, and if so, how?

Thanks in advance.

/home/zbxs/miniconda3/lib/python3.7/site-packages/torch/cuda/__init__.py:125: UserWarning:
GeForce RTX 2080 Ti with CUDA capability sm_75 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
If you want to use the GeForce RTX 2080 Ti GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/

  warnings.warn(incompatible_device_warn.format(device_name, capability, " ".join(arch_list), device_name))

Python=3.7.7
cudatoolkit 10.1.243 h6bb024c_0 defaults
Pytorch 1.6.0.

You are right. I missed the 7.5 argument for CUDA 10.2.

@bsun0802, as @tom mentioned, sm_75 is still available in the CUDA 10.2 binary, so you might want to install that particular version.

What is the rationale behind removing sm_75 for the binaries that target CUDA 10.1? I am afraid that I cannot upgrade to CUDA 10.2 due to dependency and driver issues.

What are the effects of using PyTorch binaries that do not support sm_75 on a graphics card that supports sm_75?

tagging @seemethere, we need to fix this bug and re-push the CUDA 10.1 binaries.

okay, I got an update. @seemethere is fixing the issue and re-uploading the cuda 10.1 binaries asap, like in ~3 hours or so

Awesome, thank you for taking care of this so quickly!

Just a heads-up for everyone following along: the issue has been fixed under the hood, but the binaries have unfortunately not been updated yet: https://anaconda.org/pytorch/pytorch/files

The new binaries seem to be available now, at least for pip. Make sure to pass --no-cache-dir so that pip does not reuse a locally cached binary when installing PyTorch again.

pip install --no-cache-dir torch==1.6.0+cu101 torchvision==0.7.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html
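
After reinstalling, one way to confirm the warning is gone is to record warnings while CUDA initializes. This is only a sketch: it assumes the capability check runs during CUDA initialization (the warning output earlier in the thread points at torch/cuda/__init__.py), and compat_warnings and check_install are hypothetical helper names.

```python
import warnings


def compat_warnings(records):
    """Keep only recorded warnings that mention an incompatible GPU."""
    return [str(w.message) for w in records if "is not compatible" in str(w.message)]


def check_install():
    """Initialize CUDA once and return any compatibility warnings it raised."""
    import torch  # imported here so compat_warnings works without PyTorch

    with warnings.catch_warnings(record=True) as records:
        warnings.simplefilter("always")  # record every warning, not just the first
        if torch.cuda.is_available():
            # Assumption: the capability check runs as part of CUDA init.
            torch.cuda.init()
    return compat_warnings(records)
```

Call check_install() in a fresh Python process, since CUDA initialization only happens once per process. An empty list means no compatibility warning was raised.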

The conda binaries shouldn’t have been broken; only the wheels were broken, and they are fixed now. Thanks for your patience!

Thank you for chiming in! I can confirm that the new binaries showed up on pip a few hours or so ago.

My code (on a Colab Tesla T4 GPU) seems to work despite the warning. What is the impact of this bug?

The sm_70 argument should add both the SASS and the PTX to the binaries. The PTX would make the code executable on your T4 (compute capability 7.5), but might not yield optimal performance.

To the best of my knowledge, there is no sm_70 PTX in the PyTorch 1.6 binaries, but the sm_70 ISA is compatible with sm_75, although some kernels might not launch, if I’m understanding this correctly (from section 1, "Turing Compatibility", of the Turing Compatibility Guide 12.3 documentation):

The Turing architecture is based on Volta’s Instruction Set Architecture ISA 7.0, extending it with new instructions. As a consequence, any binary that runs on Volta will be able to run on Turing (forward compatibility), but a Turing binary will not be able to run on Volta. Please note that Volta kernels using more than 64KB of shared memory (via the explicit opt-in, see CUDA C++ Programming Guide) will not be able to launch on Turing, as they would exceed Turing’s shared memory capacity.
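
The rule quoted above can be turned into a small classifier. This is a sketch under assumptions, not PyTorch's own check: it assumes arch list entries look like sm_70 (compiled SASS) or compute_70 (embedded PTX), and it relies on the general CUDA rule that a binary built for capability X.y runs on devices of capability X.z with z >= y. The function names are hypothetical, and the shared memory caveat from the quote is not modeled.

```python
import re


def parse_arch(arch):
    """Split 'sm_70' or 'compute_70' into (kind, (major, minor)); None if unrecognized."""
    m = re.fullmatch(r"(sm|compute)_(\d+)", arch)
    if m is None:
        return None
    return m.group(1), divmod(int(m.group(2)), 10)


def classify(device_cap, arch_list):
    """Classify how a build with arch_list can run on a device of capability device_cap."""
    parsed = [p for p in map(parse_arch, arch_list) if p is not None]
    sass = [cap for kind, cap in parsed if kind == "sm"]
    ptx = [cap for kind, cap in parsed if kind == "compute"]

    if device_cap in sass:
        return "native"  # SASS compiled for exactly this architecture
    major, minor = device_cap
    if any(m == major and n <= minor for m, n in sass):
        return "binary-compatible"  # same major version, older minor (e.g. sm_70 on 7.5)
    if any(cap <= device_cap for cap in ptx):
        return "ptx-jit"  # the driver can JIT-compile embedded PTX
    return "unsupported"
```

With the arch list from the warning in this thread, classify((7, 5), ["sm_37", "sm_50", "sm_60", "sm_70"]) returns "binary-compatible", which matches the reports that training still works on Turing cards despite the warning.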

This removed the warnings for me. Thanks!
