I am trying to train a network on my NVIDIA RTX 3070. I receive the following error:
NVIDIA GeForce RTX 3070 with CUDA capability sm_86 is not compatible with the current PyTorch installation. The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70. If you want to use the NVIDIA GeForce RTX 3070 GPU with PyTorch.
I am trying with the latest stable version of PyTorch that works with CUDA 11.1 (also tried with 10.2, but didn’t have any luck).
Could it be that NVIDIA 3070 works with CUDA 11.2 and higher, while PyTorch supports up to CUDA 11.1 for the moment?
That seems strange, as Ampere support should be there and has been for a while now, IIRC. If you can build the latest version of PyTorch from source, you can set TORCH_CUDA_ARCH_LIST="8.6" in your environment to force it to build with SM 8.6 support.
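For a source build, the variable only needs to be present in the environment of the build process. Here is a minimal sketch in Python, assuming a local PyTorch checkout that is built with `python setup.py develop`. The `pytorch_src` path and both helper names are hypothetical. It has no effect on an already installed pip wheel or conda binary.

```python
import os
import subprocess
import sys


def build_env(arch_list="8.6"):
    """Return a copy of the current environment with TORCH_CUDA_ARCH_LIST set."""
    env = dict(os.environ)
    env["TORCH_CUDA_ARCH_LIST"] = arch_list  # "8.6" corresponds to sm_86
    return env


def build_from_source(pytorch_src, arch_list="8.6"):
    """Start the source build inside `pytorch_src` (hypothetical checkout path)."""
    # Assumption: the checkout is built with `python setup.py develop`.
    return subprocess.run(
        [sys.executable, "setup.py", "develop"],
        cwd=pytorch_src,
        env=build_env(arch_list),
        check=True,
    )
```

Passing the variable through `env=` keeps it scoped to the build process instead of changing the shell or the IDE configuration.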
No, I don’t think so, as it doesn’t change the behavior of the binaries and is only used if you build PyTorch from source, which isn’t the case if I understand your workflow correctly.
The original error message is raised if you’ve installed a pip wheel or conda binary that doesn’t support your architecture. Based on the message (the install supports only sm_37, sm_50, sm_60, and sm_70),
it seems you were using a pip wheel with CUDA<=10.2.
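The reasoning can be sketched in a few lines of Python: pull the device capability and the supported list out of the error text, then check whether any listed architecture is sm_80 or newer. The function names are made up for illustration, and the "no sm_8x means a pre-CUDA-11 build" rule is a heuristic, not something PyTorch reports directly.

```python
import re


def parse_capabilities(message):
    """Return (device_arch, supported_archs) as integers parsed from the error text."""
    device = re.search(r"with CUDA capability sm_(\d+)", message)
    supported = re.search(r"supports CUDA capabilities((?:\s+sm_\d+)+)", message)
    device_arch = int(device.group(1)) if device else None
    archs = (
        [int(a) for a in re.findall(r"sm_(\d+)", supported.group(1))]
        if supported
        else []
    )
    return device_arch, archs


def looks_like_pre_ampere_build(message):
    """True if the message lists architectures and none is sm_80 or newer."""
    _, archs = parse_capabilities(message)
    return bool(archs) and max(archs) < 80


msg = (
    "NVIDIA GeForce RTX 3070 with CUDA capability sm_86 is not compatible "
    "with the current PyTorch installation. The current PyTorch install "
    "supports CUDA capabilities sm_37 sm_50 sm_60 sm_70."
)
print(parse_capabilities(msg))           # (86, [37, 50, 60, 70])
print(looks_like_pre_ampere_build(msg))  # True
```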
How might I verify if the pip wheel used CUDA<=10.2? I wanted to learn how to check for this.
In fact, my problem is more nuanced. I can get the code to work when I run it in the python/ipython interpreters, but any time I try to debug with code or PyCharm, I get the error message.
I have been on this for two weeks, trying and re-trying different combinations of PyTorch versions (and NVIDIA drivers/CUDA toolkit/libcudnn). I have checked many times the virtual environments I use and how I select them in code. I have tried everything I know except building from source, and have not been able to resolve this discrepancy on my system.
Environment Information: my setup tries to follow the Nvidia compatibility matrix: driver-470/toolkit 114/libcudnn8.2/pytorch1.9+cu111
PyTorch version: 1.9.0+cu111
Is debug build: False
CUDA used to build PyTorch: 11.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.2 LTS (x86_64)
GCC version: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.31
Python version: 3.8.10 (default, Jun 2 2021, 10:49:15) [GCC 9.4.0] (64-bit runtime)
Python platform: Linux-5.11.0-25-generic-x86_64-with-glibc2.29
Is CUDA available: True
CUDA runtime version: 11.4.100
GPU models and configuration:
GPU 0: NVIDIA GeForce RTX 3090
GPU 1: NVIDIA GeForce RTX 3090
Nvidia driver version: 470.57.02
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Versions of relevant libraries:
[pip3] torch==1.9.0+cu111 <---- Is this a problem? No cuda 114?
[conda] Could not collect
And sys.path looks good. Is there anything specific to IDEs that I may be missing on this?
Based on your cross-post I would also assume that your PyCharm is using another env with a different PyTorch installation.
I would thus either create a new virtual env and reinstall PyTorch + pycharm there or make sure to uninstall all PyTorch installations in the current and base environment and reinstall it in the current env only.
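One way to test this assumption is to run the same small script once from the terminal and once from the IDE's run/debug configuration, then compare the output. The sketch below uses only the standard library and does not import torch. The function name is made up for illustration.

```python
import importlib.util
import sys
from importlib import metadata


def interpreter_report():
    """Collect which interpreter is running and which torch it would import."""
    spec = importlib.util.find_spec("torch")
    try:
        version = metadata.version("torch")  # e.g. a string like "1.9.0+cu111"
    except metadata.PackageNotFoundError:
        version = None
    return {
        "executable": sys.executable,  # path of the running Python binary
        "prefix": sys.prefix,          # root of the active environment
        "torch_location": spec.origin if spec is not None else None,
        "torch_version": version,
    }


if __name__ == "__main__":
    for key, value in interpreter_report().items():
        print(f"{key}: {value}")
```

If `executable` or `torch_location` differs between the two runs, the IDE is not using the environment you expect.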
I discovered my virtual environment had problems. When I tried to install packages to it, they would be installed globally not locally. There was corruption throughout… I deleted it and started from scratch.
I came up with the following steps as a guide for anyone who would like to have a type of cheatsheet to verify their installation:
Nvidia Driver and CUDA Toolkit
If already installed, examine your Nvidia GPU driver version
Learn its architecture
sudo lshw -C display
Learn your current Linux kernel
Look up the Nvidia Compatibility Matrix to determine the correct driver, toolkit, and libcudnn
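The checks in this list can be collected from one Python script. This is only a sketch: `lshw -C display` comes from the list above, while the use of `nvidia-smi` for the driver version and `platform.release()` for the kernel are my assumptions about the commands meant by the other steps. The helper names are made up.

```python
import platform
import shutil
import subprocess


def run_if_available(cmd, timeout=30):
    """Run `cmd` and return its stdout, or None if the tool is missing or fails."""
    if shutil.which(cmd[0]) is None:
        return None
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return result.stdout if result.returncode == 0 else None


def system_report():
    """Gather driver, GPU, and kernel information for the compatibility lookup."""
    return {
        # Assumption: nvidia-smi is the tool used to read the driver version.
        "driver": run_if_available(
            ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"]
        ),
        # Run without sudo here, so the output may be less complete.
        "display": run_if_available(["lshw", "-C", "display"]),
        # Kernel release string of the running system.
        "kernel": platform.release(),
    }
```

Calling `system_report()` and printing each entry gives the values to look up in the compatibility matrix. Entries are `None` when a tool is not installed.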
NVIDIA GeForce RTX 3080 with CUDA capability sm_86 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
How would you check the CUDA version within the pip wheel? Currently I have torch 1.9.1 and torchvision 0.10.1 in the virtual environment. When running ‘nvidia-smi’ I get a CUDA version of 11.4. Any pointers?
It depends a bit on your use case. The pip wheels and conda binaries ship with their own CUDA runtime and you will be able to run PyTorch code without using the local CUDA toolkit (as long as the right CUDA runtime is selected; e.g. for Ampere GPUs you have to use CUDA>=11).
However, if you want to build a custom CUDA extension, the local CUDA toolkit will be used and you should install a matching version to the runtime used in PyTorch.
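To see which CUDA runtime and which architectures an installed binary actually ships with, you can ask PyTorch itself. The sketch below wraps the relevant attributes in a hypothetical helper and returns `None` when torch is not importable. Note that the CUDA version printed by nvidia-smi belongs to the driver, not to the wheel.

```python
def describe_torch_build():
    """Return build details of the importable torch, or None if torch is missing."""
    try:
        import torch
    except ImportError:
        return None
    info = {
        "torch_version": torch.__version__,
        # CUDA runtime the binary was built with; None for CPU-only builds.
        "cuda_build": torch.version.cuda,
        # Architectures the binary was compiled for; empty if CUDA is unavailable.
        "arch_list": torch.cuda.get_arch_list(),
        "device_capability": None,
    }
    if torch.cuda.is_available():
        major, minor = torch.cuda.get_device_capability(0)
        info["device_capability"] = f"sm_{major}{minor}"
    return info


if __name__ == "__main__":
    print(describe_torch_build())
```

If `device_capability` (for example `sm_86`) does not appear in `arch_list`, the installed binary was not built for your GPU, regardless of which local CUDA toolkit is installed.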
Hi @Ilias_Giannakopoulos, I am facing the same issue with my RTX 3070. I haven’t installed torch globally; I installed it only inside a conda environment, and I’m using Ubuntu as my OS. How can I add TORCH_CUDA_ARCH_LIST? Is it possible to set this variable while installing with conda?