[NEED HELP] Trouble with CUDA capability sm_86

Thank you.

I discovered my virtual environment had problems. When I tried to install packages into it, they were installed globally instead of locally. The environment was corrupted throughout, so I deleted it and started from scratch.

I came up with the following steps as a cheat sheet for anyone who would like to verify their installation:

Nvidia Driver and CUDA Toolkit

  1. If already installed, examine your Nvidia GPU driver version

nvidia-smi

or

cat /proc/driver/nvidia/version

  1. Identify your GPU model; its architecture follows from the model (sm_86, for example, is Ampere)

sudo lshw -C display

  1. Check your current Linux kernel version

uname -a

  1. Look up the Nvidia Compatibility Matrix to determine the correct driver, toolkit, and libcudnn

Support Matrix - NVIDIA Docs

Support Matrix - NVIDIA Docs (gcc, glibc)

  1. Install Driver

sudo apt install nvidia-driver-XXX

  1. Install CUDA Toolkit

https://developer.nvidia.com/cuda-downloads

  1. Install libcudnnX (useful for deep learning with CUDA)

Installation Guide - NVIDIA Docs

sudo apt install libcudnnX

  1. Install PyTorch
  • We will hold off on this until you set up your virtualenv below.
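If you would rather script the driver check than read the nvidia-smi table by eye, here is a minimal sketch. The helper name `query_nvidia_driver` is my own (hypothetical), and it assumes nvidia-smi is on your PATH; if it is not, the function returns an empty list instead of failing.

```python
import shutil
import subprocess


def query_nvidia_driver():
    """Return a list of (gpu_name, driver_version) tuples.

    Hypothetical helper: returns [] when nvidia-smi is missing or fails,
    which usually means the driver is not installed or not loaded.
    """
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        completed = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            check=True,
            timeout=30,
        )
    except (subprocess.SubprocessError, OSError):
        return []

    gpus = []
    for line in completed.stdout.strip().splitlines():
        # Split on the last comma so a comma inside the GPU name cannot break parsing.
        parts = line.rsplit(",", 1)
        if len(parts) == 2:
            gpus.append((parts[0].strip(), parts[1].strip()))
    return gpus


if __name__ == "__main__":
    found = query_nvidia_driver()
    if not found:
        print("No GPU reported by nvidia-smi (driver missing or not loaded?)")
    for name, driver in found:
        print(f"{name}: driver {driver}")
```

Compare the driver version it prints against the support matrix before picking a toolkit and libcudnn version.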

Testing your system’s python setup

  1. First, note the location of the system-wide Python interpreter

which python3

  1. Note the location of the system-wide pip

which pip3

  1. List the packages installed globally (this command also lists packages that were installed via apt-get install)

python3 -m pip list (or alternatively python3 -m pip freeze)

  1. Create a virtualenv if you have not already, then activate it

python3 -m venv name_for_your_env

source name_for_your_env/bin/activate

  1. Usually, a repo asks you to install its required packages, normally listed in a “requirements.txt” file. Examine it and become familiar with it. From within your virtual environment, install them via:

python3 -m pip install -r requirements.txt

  1. If not already installed, install PyTorch.

You can get the pip3/conda command from the official PyTorch installation page. Most people recommend conda/docker installs. We are using pip3 to have more flexibility with the packages that different repos need.

  1. Note that if a package is properly installed, it should appear in your virtual_env/lib/pythonX.X/site-packages folder.

  2. Additionally, ensure your PYTHONPATH is properly set (learn more about PYTHONPATH, imports, and sys.path here: The Definitive Guide to Python import Statements | Chris Yeh)

  • PYTHONPATH is an environment variable listing additional directories that Python searches for modules and scripts (not binaries) on import.

  • The PYTHONPATH environment variable is typically set in the .bashrc file found in your user folder (the user folder is located at ~/ and the leading “.” means it is a hidden file). Use your favorite editor to open it:

emacs ~/.bashrc

  • Check whether you have already set any PYTHONPATH entries. They look like this:

export PYTHONPATH=$PYTHONPATH:/new/path1/goes/here:/new/path2/goes/here
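Since my original problem was packages landing globally instead of in the virtualenv, here is a small sketch that reports which environment the running interpreter actually belongs to. The function name `describe_environment` is made up for illustration, and the venv test assumes an environment created with python3 -m venv (or a recent virtualenv), where sys.prefix differs from sys.base_prefix.

```python
import os
import sys
import sysconfig


def describe_environment():
    """Summarize where this interpreter lives and where pip would install packages."""
    pythonpath = os.environ.get("PYTHONPATH", "")
    return {
        # Inside a venv, sys.prefix points at the venv and base_prefix at the system Python.
        "in_venv": sys.prefix != sys.base_prefix,
        "interpreter": sys.executable,
        # Default install location for pure-Python packages in this environment.
        "site_packages": sysconfig.get_paths()["purelib"],
        # Non-empty PYTHONPATH entries, in search order.
        "pythonpath": [p for p in pythonpath.split(os.pathsep) if p],
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Run it with the virtualenv activated. If in_venv comes back False, or site_packages points outside your environment folder, pip will be installing somewhere other than where you expect.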

Sanity Checks for torch/gpu

  1. In your virtualenv, open a Python interpreter:

python3 (or, even better, ipython3; install it first with pip3 install ipython)

  1. Check the system path from which modules are loaded

import sys

sys.path  # should not list any undesired paths

  1. Import torch

import torch

  1. Double check that this torch module is located inside your virtual environment

import importlib.util  # the old imp module is deprecated and was removed in Python 3.12

importlib.util.find_spec('torch').origin  # should return a path in your virtualenv

  1. Check the version of your torch module and cuda

torch.__version__

torch.version.cuda

  1. Check the supported architectures

torch.cuda.get_arch_list()

  1. Check the number of GPUs detected

torch.cuda.device_count()

  1. Can you read the device?

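device = torch.device('cuda:0')  # 0 by default; if you have more GPUs, increase the index

torch.zeros(1, device=device)  # creating the device object alone does not touch the GPU; allocating a tensor does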
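To run the sanity checks above in one go, something like the following sketch may help. `torch_report` is a hypothetical helper of mine, not part of PyTorch; it only uses calls from the steps above plus torch.cuda.is_available(), torch.cuda.get_device_capability() and torch.cuda.get_device_name(). The arch_listed flag is an exact string match against get_arch_list(), so treat a False there as a hint to investigate, not as proof of incompatibility.

```python
import importlib.util
import sys


def torch_report():
    """Collect the torch/GPU sanity checks into one dict (hypothetical helper)."""
    spec = importlib.util.find_spec("torch")
    if spec is None:
        return {"installed": False}

    try:
        import torch
    except Exception as exc:  # a broken install can fail in many ways
        return {"installed": True, "location": spec.origin, "import_error": repr(exc)}

    report = {
        "installed": True,
        "location": spec.origin,
        # True when torch lives under the active environment's prefix.
        # Only meaningful when run from inside the virtualenv.
        "under_sys_prefix": (spec.origin or "").startswith(sys.prefix),
        "torch_version": torch.__version__,
        "cuda_version": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
        "arch_list": torch.cuda.get_arch_list(),
        "device_count": torch.cuda.device_count(),
    }

    if report["cuda_available"]:
        major, minor = torch.cuda.get_device_capability(0)
        report["device_name"] = torch.cuda.get_device_name(0)
        report["device_arch"] = f"sm_{major}{minor}"
        # Exact match only; see the caveat in the text above.
        report["arch_listed"] = report["device_arch"] in report["arch_list"]
    return report


if __name__ == "__main__":
    for key, value in torch_report().items():
        print(f"{key}: {value}")
```

On a working setup for the GPU in this thread, I would expect device_arch to read sm_86 and to appear in arch_list.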
