Thank you.
I discovered that my virtual environment was broken. When I tried to install packages into it, they were installed globally rather than locally. The corruption ran throughout, so I deleted the environment and started from scratch.
I came up with the following steps as a cheatsheet for anyone who would like to verify their installation:
Nvidia Driver and CUDA Toolkit
- If already installed, examine your Nvidia GPU driver version
nvidia-smi
or
cat /proc/driver/nvidia/version
- Learn its architecture
sudo lshw -C display
- Learn your current Linux kernel
uname -a
- Look up the Nvidia Compatibility Matrix to determine the correct driver, toolkit, and libcudnn
Support Matrix - NVIDIA Docs (gcc, glibc)
- Install Driver
sudo apt install nvidia-driver-XXX
- Install CUDA Toolkit
https://developer.nvidia.com/cuda-downloads
- Install libcudnnX (needed for deep learning with CUDA)
Installation Guide - NVIDIA Docs
sudo apt install libcudnnX
- Install pytorch
- We will hold off on this until you set up your virtualenv below.
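The driver and kernel checks above can also be gathered from Python in one pass. This is only a minimal sketch: the helper name run_if_available is hypothetical, it uses nothing beyond the standard library, and it assumes that a missing or failing command should simply be reported as None rather than raise.

```python
import shutil
import subprocess


def run_if_available(cmd):
    """Run cmd (a list of strings) and return its stdout as a string.

    Returns None if the executable is not found, exits with an error,
    or does not finish within 10 seconds.
    """
    if shutil.which(cmd[0]) is None:
        return None
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, check=True, timeout=10
        )
    except (subprocess.CalledProcessError, subprocess.TimeoutExpired, OSError):
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    # Same commands as in the list above; None means "not found or failed".
    for cmd in (["nvidia-smi"], ["uname", "-a"]):
        print(" ".join(cmd), "->", run_if_available(cmd))
```

On a machine without an Nvidia driver, the nvidia-smi entry should come back as None, which is already a useful first answer.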
Testing your system’s python setup
- First, note the location of the system-wide python interpreter
which python3
- Note the location of the system-wide pip
which pip3
- What packages are there globally (this command will also list packages that were installed via apt-get install)
python3 -m pip list (or alternatively python3 -m pip freeze)
- Create a virtualenv if not yet created, then activate it
python3 -m venv name_for_your_env
source name_for_your_env/bin/activate
- Usually, you will be asked to install the required packages, normally listed in the file “requirements.txt”. Examine it and become familiar with it. From within your activated virtual environment, install them via:
python3 -m pip install -r requirements.txt
- If not already installed, install pytorch.
You can get the pip3/conda command from here. Most people recommend conda/docker installs. We are using pip3 to have more flexibility with the packages we need for different repos.
- go to https://pytorch.org/ (choose your config). You may get a command like:
Note that if a package is properly installed, it should appear in your virtual_env/lib/pythonX.X/site-packages folder.
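To confirm from inside Python that you are really in the virtual environment and that a package resolves to its site-packages folder, something like the following may help. It is a sketch under assumptions: the function names in_virtualenv and package_location are hypothetical, and the prefix comparison assumes an environment made with venv or a recent virtualenv.

```python
import importlib.util
import sys
import sysconfig


def in_virtualenv():
    """True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the system installation.
    return sys.prefix != sys.base_prefix


def package_location(name):
    """Return the file a top-level package would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    print("interpreter:    ", sys.executable)
    print("in a virtualenv:", in_virtualenv())
    print("site-packages:  ", sysconfig.get_paths()["purelib"])
    print("pip comes from: ", package_location("pip"))
```

If in_virtualenv() is False while you believe the environment is active, pip installs will go to the system or user site-packages, which matches the symptom described at the top of this post.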
Additionally, ensure your pythonpath is properly set (learn more about pythonpath/imports/sys.path here: The Definitive Guide to Python import Statements | Chris Yeh)
pythonpath is an environment variable that contains extra paths from which python loads modules/scripts that are not binaries (i.e. located.
The pythonpath env variable is set in the .bashrc file found in your user folder (the user folder is located at ~/ and the “.” means it is a hidden file). Use your favorite editor to open it:
emacs ~/.bashrc
- Look to see if you have already set any pythonpath entries. A line that adds paths looks like this:
export PYTHONPATH=$PYTHONPATH:/new/path1/goes/here:/new/path2/goes/here:
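After editing .bashrc and opening a new shell, you can check what Python actually sees in the variable. A minimal sketch, assuming you only want to know which entries exist on disk; the helper name pythonpath_entries is hypothetical.

```python
import os


def pythonpath_entries(value=None):
    """Split a PYTHONPATH-style string into (path, exists) pairs.

    Reads the PYTHONPATH environment variable when value is None.
    Empty entries (for example from a trailing separator) are skipped.
    """
    if value is None:
        value = os.environ.get("PYTHONPATH", "")
    return [(p, os.path.exists(p)) for p in value.split(os.pathsep) if p]


if __name__ == "__main__":
    entries = pythonpath_entries()
    if not entries:
        print("PYTHONPATH is not set (or is empty)")
    for path, exists in entries:
        print(("ok      " if exists else "missing ") + path)
```

Entries reported as missing are usually leftovers from old projects and are good candidates for removal from .bashrc.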
Sanity Checks for torch/gpu
- In your virtualenv, open a python interpreter:
python3 (or even better ipython3; you will need to install it first with pip3 install ipython).
- Check the system path from which modules are loaded
import sys
sys.path (should not see undesired paths here).
- Import torch
import torch
- Double check that this torch module is located inside your virtual environment
import importlib.util
importlib.util.find_spec('torch').origin → should return a path in your virtualenv (the older imp module was removed in Python 3.12)
- Check the version of your torch module and cuda
torch.__version__
torch.version.cuda
- Check the supported architectures
torch.cuda.get_arch_list()
- Check for the number of gpu detected
torch.cuda.device_count()
- Can you read the device?
device = torch.device('cuda:0') # index 0 by default; if you have more GPUs, increase the index.
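The sanity checks above can be bundled into a single function that you run inside the virtualenv. This is only a sketch: the name torch_report is hypothetical, and it adds torch.cuda.is_available() and torch.cuda.get_device_name(0), which are not in the list above, because creating a torch.device object does not by itself touch the GPU. It returns early if torch is not installed.

```python
import importlib.util


def torch_report():
    """Collect the torch/GPU sanity checks into a dict.

    Does not assume that torch is installed or that a GPU is present.
    """
    spec = importlib.util.find_spec("torch")
    if spec is None:
        return {"installed": False}

    import torch

    report = {
        "installed": True,
        "location": spec.origin,             # should be inside your virtualenv
        "torch_version": torch.__version__,
        "cuda_version": torch.version.cuda,  # None on CPU-only builds
        "cuda_available": torch.cuda.is_available(),
        "device_count": torch.cuda.device_count(),
    }
    if report["cuda_available"]:
        report["arch_list"] = torch.cuda.get_arch_list()
        report["device_0_name"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in torch_report().items():
        print(f"{key}: {value}")
```

If location points outside your virtualenv, or cuda_version is None when you expected a CUDA build, revisit the install steps above before debugging anything else.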