How to check if torch uses cuDNN

I just used Packer to bake my own images for GCE and ran into the following situation.

I installed CUDA 9.0 and everything worked fine: I could train my models on the GPU.
After a while I noticed that I had forgotten to install cuDNN, but PyTorch does not seem to complain about this. On an image with only CUDA installed, if I run

torch.backends.cudnn.version(), I get 7102, and torch.backends.cudnn.enabled is True.

When I then installed cuDNN from https://developer.nvidia.com/cudnn, everything still worked and I got the same outputs for the two commands above, but I did not see a significant speedup.

Does this mean that if one installs only CUDA and PyTorch, cuDNN also gets installed automatically? Or is there a way to check whether PyTorch is really using the speedups promised by cuDNN?

Any advice? Thanks :)

How did you install PyTorch?
The prebuilt PyTorch binaries already ship with CUDA and cuDNN.
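
One way to see what the installed binary brings along is to ask PyTorch itself. The following is a minimal sketch, assuming a reasonably recent PyTorch release; the helper name cudnn_report is made up for illustration. The values it prints describe the libraries PyTorch itself loads, which are not necessarily the ones installed system-wide.

```python
import importlib


def cudnn_report():
    """Collect what the installed PyTorch build says about CUDA and cuDNN."""
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        # PyTorch is not installed (or failed to load) in this environment.
        return {"torch_installed": False}

    cudnn = torch.backends.cudnn
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version the binary was built against (None for CPU-only builds).
        "built_with_cuda": torch.version.cuda,
        # True only if a usable GPU and driver are present.
        "cuda_available": torch.cuda.is_available(),
        # True if this build can load cuDNN.
        "cudnn_available": cudnn.is_available(),
        # An integer such as 7102, or None if cuDNN cannot be loaded.
        "cudnn_version": cudnn.version(),
        # Whether PyTorch is allowed to dispatch to cuDNN kernels.
        "cudnn_enabled": cudnn.enabled,
    }


if __name__ == "__main__":
    for key, value in cudnn_report().items():
        print(f"{key}: {value}")
```

If the reported cuDNN version stays the same before and after installing a system-wide cuDNN, that suggests PyTorch is using the copy bundled with the binary.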

I used a script like the one below to install CUDA, cuDNN and Python, and then ran pipenv install torch to install PyTorch. The image was based on Google Cloud's “ubuntu-1604-lts”. But even if I comment out the line that installs cuDNN, nothing seems to change for my PyTorch installation.

# install CUDA
echo "Checking for CUDA and installing."
# Check for CUDA and try to install.
if ! dpkg-query -W cuda-9-0; then
  # The 16.04 installer works with 16.10.
  wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_9.0.176-1_amd64.deb
  sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/7fa2af80.pub
  sudo dpkg -i cuda-repo-ubuntu1604_9.0.176-1_amd64.deb
  sudo apt-get update
  sudo apt-get install cuda-9-0 -y
fi

# install cuDNN
sudo dpkg -i /tmp/libcudnn7_7.1.4.18-1+cuda9.0_amd64.deb


# install python
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt-get update
sudo apt-get install -y python3.6
sudo apt-get install -y python3-pip

# install pipenv
sudo pip3 install pipenv

Ok, I just found an answer by soumith on another thread:

“if you want to use pytorch with an NVIDIA GPU, all you need to do is install pytorch binaries and start using it. We ship with everything in-built (pytorch binaries include CUDA, CuDNN, NCCL, MKL, etc.).”

So that means the whole business of installing CUDA and cuDNN on Ubuntu is not actually necessary at all?! That would also explain why I get the same training time whether or not I install cuDNN.

Sorry for the confusion.

Yes, you just need to install the NVIDIA drivers and the binaries will come with the other libs.
If you want to build from source, you would need to install CUDA, cuDNN etc.
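
To check whether the bundled cuDNN makes a measurable difference on a given machine, one rough option is to time the same convolution with cuDNN switched on and off through the torch.backends.cudnn.flags context manager. This is only a sketch: the function name time_conv, the layer sizes and the iteration count are arbitrary choices for illustration, and the timings will vary by GPU and PyTorch version.

```python
import importlib
import time


def time_conv(enabled, iters=20):
    """Seconds per forward+backward pass of one conv layer, or None without a GPU."""
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None

    # Arbitrary layer and input sizes, chosen only to give the GPU some work.
    conv = torch.nn.Conv2d(64, 128, kernel_size=3, padding=1).cuda()
    x = torch.randn(32, 64, 56, 56, device="cuda")

    # Inside this block PyTorch may (enabled=True) or may not (enabled=False)
    # dispatch to cuDNN kernels.
    with torch.backends.cudnn.flags(enabled=enabled):
        for _ in range(3):  # warm-up passes, not timed
            conv(x).sum().backward()
        torch.cuda.synchronize()  # wait for queued GPU work before timing

        start = time.perf_counter()
        for _ in range(iters):
            conv(x).sum().backward()
        torch.cuda.synchronize()
        elapsed = time.perf_counter() - start

    return elapsed / iters


if __name__ == "__main__":
    for enabled in (True, False):
        print("cudnn enabled =", enabled, "->", time_conv(enabled), "s/iter")
```

A clear gap between the two numbers would indicate that cuDNN kernels are being used when enabled; the function returns None when PyTorch or a CUDA device is not available.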

If that is the case, why did I encounter the following error when importing torch?

libcudart.so.10.0: cannot open shared object file: No such file or directory

I installed the PyTorch 1.0 binary for CUDA 10, and I already have CUDA 9.0 on my system.
If PyTorch does ship with everything built in, why can't it find something that comes with it?

Could you check your LD_LIBRARY_PATH to see if you have some libs linking against your own libcudart as described in this issue?
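
As a rough aid for that check, the sketch below lists which directories on LD_LIBRARY_PATH contain a libcudart. The function name find_cudart is hypothetical, and it only inspects LD_LIBRARY_PATH, not the other locations the dynamic linker searches.

```python
import glob
import os


def find_cudart(ld_library_path=None):
    """Map each LD_LIBRARY_PATH directory to the libcudart files found in it."""
    if ld_library_path is None:
        ld_library_path = os.environ.get("LD_LIBRARY_PATH", "")

    hits = {}
    for directory in ld_library_path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries such as a trailing ':'
        matches = sorted(glob.glob(os.path.join(directory, "libcudart.so*")))
        if matches:
            hits[directory] = [os.path.basename(m) for m in matches]
    return hits


if __name__ == "__main__":
    found = find_cudart()
    if not found:
        print("No libcudart found on LD_LIBRARY_PATH")
    for directory, libs in found.items():
        print(directory, "->", ", ".join(libs))
```

If this only shows a CUDA 9.0 runtime while the error mentions libcudart.so.10.0, that would be consistent with something in the environment pointing at the system CUDA install.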
