cuDNN built against wrong CUDA version (10.0 instead of 9.0) when building from source -> CUDNN_STATUS_NOT_INITIALIZED

I have trouble building PyTorch with CUDA 9.0 from source.

Everything works when I install PyTorch 1.1.0 from conda with CUDA 9.0 using:

conda install pytorch==1.1.0 torchvision==0.3.0 cudatoolkit=9.0 -c pytorch

Then I tried to upgrade to PyTorch 1.3.1 by building from source, since the release only has prebuilt PyTorch binaries for CUDA 9.2.
Now I get a CUDNN_STATUS_NOT_INITIALIZED error (see “To Reproduce”).
The PyTorch environment info shows the correct CUDA version (CUDA used to build PyTorch: 9.0.176).
However, PyTorch reports CuDNN 7.4.1 (built against CUDA 10.0), which is not what I want.
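
For reference, the two versions can also be read directly from Python. This is only a minimal sketch: `decode_cudnn_version` is a hypothetical helper name, and it assumes the cuDNN 7.x integer encoding (major*1000 + minor*100 + patch), which newer cuDNN major versions do not necessarily follow.

```python
def decode_cudnn_version(v):
    """Split a cuDNN 7.x integer version (major*1000 + minor*100 + patch) into parts."""
    major, rest = divmod(v, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
    else:
        # CUDA toolkit version PyTorch was compiled with (None for CPU-only builds)
        print("CUDA used to build PyTorch:", torch.version.cuda)
        # Integer cuDNN version, or None if cuDNN is not available
        v = torch.backends.cudnn.version()
        if v is not None:
            print("cuDNN:", ".".join(str(p) for p in decode_cudnn_version(v)))
```

Note that this only shows the cuDNN version; which CUDA version that cuDNN build targets is shown in the build configuration printed under “Additional context”.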

Related issues on CUDNN_STATUS_NOT_INITIALIZED did not help me.

So my question is: how can I specify the CUDA target version for cuDNN when building from source? Or is there anything else that I am missing?

To Reproduce

Steps to reproduce the behavior:

  1. Install PyTorch v1.3.1 branch from source
cd ~/path/to/pytorch
git checkout v1.3.1
git submodule sync
git submodule update --init --recursive

python setup.py clean
rm -rf ~/.nv

conda create -n pytorch131 python=3.7
conda activate pytorch131
conda install numpy ninja pyyaml mkl mkl-include setuptools cmake cffi typing
conda install -c pytorch magma-cuda90

export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname $(which conda))/../"}
python setup.py install
  2. Run the following Python script
import torch
from torch import nn

m = nn.Conv2d(8, 13, 3, stride=2).cuda()
input = torch.randn(5, 8, 20, 30, device="cuda")
output = m(input)
print("success", output.shape)
  3. Observe the error
Traceback (most recent call last):
  File "../", line 13, in <module>
    output = m(input)
  File "/home/ubuntu/miniconda/envs/pytorch131/lib/python3.7/site-packages/torch/nn/modules/", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/miniconda/envs/pytorch131/lib/python3.7/site-packages/torch/nn/modules/", line 345, in forward
    return self.conv2d_forward(input, self.weight)
  File "/home/ubuntu/miniconda/envs/pytorch131/lib/python3.7/site-packages/torch/nn/modules/", line 342, in conv2d_forward
    self.padding, self.dilation, self.groups)

Expected behavior

The script runs without error and prints "success" followed by the output shape.

Environment

PyTorch version: 1.3.0a0+ee77ccb
Is debug build: No
CUDA used to build PyTorch: 9.0.176

OS: Ubuntu 16.04.5 LTS
GCC version: (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609
CMake version: version 3.14.0

Python version: 3.7
Is CUDA available: Yes
CUDA runtime version: Could not collect
GPU models and configuration:
GPU 0: Tesla V100-SXM2-16GB
GPU 1: Tesla V100-SXM2-16GB
GPU 2: Tesla V100-SXM2-16GB
GPU 3: Tesla V100-SXM2-16GB
GPU 4: Tesla V100-SXM2-16GB
GPU 5: Tesla V100-SXM2-16GB
GPU 6: Tesla V100-SXM2-16GB
GPU 7: Tesla V100-SXM2-16GB

Nvidia driver version: 396.44
cuDNN version: /usr/lib/x86_64-linux-gnu/

Versions of relevant libraries:
[pip] numpy==1.17.4
[pip] torch==1.3.0a0+ee77ccb
[conda] blas                      1.0                         mkl
[conda] magma-cuda90              2.5.0                         1    pytorch
[conda] mkl                       2019.4                      243
[conda] mkl-include               2019.4                      243
[conda] mkl-service               2.3.0            py37he904b0f_0
[conda] mkl_fft                   1.0.15           py37ha843d7b_0
[conda] mkl_random                1.1.0            py37hd6b4f25_0
[conda] torch                     1.3.0a0+ee77ccb          pypi_0    pypi

Additional context

I also printed the torch config using

import torch.__config__
print(torch.__config__.show())


PyTorch built with:
  - GCC 5.4
  - Intel(R) Math Kernel Library Version 2019.0.4 Product Build 20190411 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v0.20.5 (Git Hash 0125f28c61c1f822fd48570b4c1066f96fcb9b2e)
  - OpenMP 201307 (a.k.a. OpenMP 4.0)
  - NNPACK is enabled
  - CUDA Runtime 9.0
  - NVCC architecture flags: -gencode;arch=compute_70,code=sm_70
  - CuDNN 7.4.1  (built against CUDA 10.0)
  - Magma 2.5.0
  - Build settings: BLAS=MKL, BUILD_NAMEDTENSOR=OFF, BUILD_TYPE=Release, CXX_FLAGS= -fvisibility-inlines-hidden -fopenmp -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math, FORCE_FALLBACK_CUDA_MPI=1, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=True, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=ON, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,
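
The mismatch is visible in the two lines "CUDA Runtime 9.0" and "CuDNN 7.4.1  (built against CUDA 10.0)". A rough sketch of checking this automatically is below; `find_cuda_mismatch` is a hypothetical helper, and the regular expressions assume the output format shown above, which may differ in other PyTorch versions.

```python
import re


def find_cuda_mismatch(config_text):
    """Return (runtime_cuda, cudnn_cuda, mismatch) parsed from the config text, or None."""
    runtime = re.search(r"CUDA Runtime (\d+\.\d+)", config_text)
    cudnn = re.search(r"CuDNN [\d.]+\s+\(built against CUDA (\d+\.\d+)\)", config_text)
    if not runtime or not cudnn:
        return None  # one of the two lines is missing, nothing to compare
    return runtime.group(1), cudnn.group(1), runtime.group(1) != cudnn.group(1)


# The two relevant lines copied from the output above
sample = """\
  - CUDA Runtime 9.0
  - CuDNN 7.4.1  (built against CUDA 10.0)
"""
print(find_cuda_mismatch(sample))  # prints ('9.0', '10.0', True)
```

On a real installation the text would come from `torch.__config__.show()` instead of the pasted sample.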

How did you install cuDNN locally? Did you use a .deb file, and did you make sure it is for the right CUDA version?

CUDA and cuDNN were installed by an admin.
I checked the Debian packages, and now the CuDNN 7.4.1 (built against CUDA 10.0) message makes total sense to me:

dpkg -l | grep cudn
ii  libcudnn7                                                                 amd64        cuDNN runtime libraries
ii  libcudnn7-dev                                                             amd64        cuDNN development libraries and headers
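
NVIDIA's cuDNN .deb packages carry the CUDA target as a `+cudaX.Y` suffix in the package version, so the check can be scripted. The following is a sketch under assumptions: the helper names are made up, and the version string `7.4.1.5-1+cuda10.0` used in the usage note is only an illustration of the format, not copied from the listing above.

```python
import re
import subprocess


def cuda_target(pkg_version):
    """Return the CUDA version encoded in a cuDNN .deb version string, or None."""
    m = re.search(r"\+cuda(\d+\.\d+)", pkg_version)
    return m.group(1) if m else None


def installed_cudnn_packages():
    """Map libcudnn* package names to version strings (empty if dpkg-query is absent)."""
    try:
        out = subprocess.run(
            ["dpkg-query", "-W", "--showformat=${Package} ${Version}\n", "libcudnn*"],
            capture_output=True, text=True, check=False,
        ).stdout
    except FileNotFoundError:
        return {}  # not a Debian-based system
    return dict(line.split(" ", 1) for line in out.splitlines() if " " in line)


if __name__ == "__main__":
    for name, version in installed_cudnn_packages().items():
        print(name, version, "-> built for CUDA", cuda_target(version))
```

For example, `cuda_target("7.4.1.5-1+cuda10.0")` returns `"10.0"`, which would not match a CUDA 9.0 build.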

I hadn’t checked that before because I assumed there couldn’t be a mismatch, given that the PyTorch 1.1.0 binaries ran on CUDA without any issues.
I installed an appropriate cuDNN version and now it works, thanks a lot!

Out of curiosity: do you know why the previous PyTorch 1.1.0 binary installation was working? Do the binaries contain cuDNN?

Yes, the binaries ship with CUDA, cuDNN, and other libraries, so you only need the NVIDIA driver to get started.
