I have trouble building PyTorch with CUDA 9.0 from source.
Everything works when I install PyTorch 1.1.0 from conda with CUDA 9.0 using:
conda install pytorch==1.1.0 torchvision==0.3.0 cudatoolkit=9.0 -c pytorch
Then I tried to upgrade to PyTorch 1.3.1 by building from source, since the release only has prebuilt PyTorch with CUDA 9.2.
Now I get a CUDNN_STATUS_NOT_INITIALIZED error (see “To Reproduce”).
The PyTorch environment report shows the correct CUDA version (CUDA used to build PyTorch: 9.0.176).
However, torch.__config__.show() reports CuDNN 7.4.1 (built against CUDA 10.0), which is not what I want.
Related issues on CUDNN_STATUS_NOT_INITIALIZED did not help me.
So my question is: how can I specify the CUDA version that cuDNN targets when building from source? Or is there anything else that I am missing?
To Reproduce
Steps to reproduce the behavior:
- Build and install PyTorch v1.3.1 from source
cd ~/path/to/pytorch
git checkout v1.3.1
git submodule sync
git submodule update --init --recursive
python setup.py clean
rm -rf ~/.nv # https://github.com/pytorch/pytorch/issues/5942
conda create -n pytorch131 python=3.7
conda activate pytorch131
conda install numpy ninja pyyaml mkl mkl-include setuptools cmake cffi typing
conda install -c pytorch magma-cuda90
export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname $(which conda))/../"}
python setup.py install
- Run the following Python script
import torch
from torch import nn
print(torch.backends.cudnn.is_acceptable(torch.cuda.FloatTensor(1)))
m = nn.Conv2d(8, 13, 3, stride=2).cuda()
input = torch.randn(5, 8, 20, 30, device="cuda")
output = m(input)
print("success", output.shape)
- Observe the error
True
Traceback (most recent call last):
File "../test-cuda.py", line 13, in <module>
output = m(input)
File "/home/ubuntu/miniconda/envs/pytorch131/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "/home/ubuntu/miniconda/envs/pytorch131/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 345, in forward
return self.conv2d_forward(input, self.weight)
File "/home/ubuntu/miniconda/envs/pytorch131/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 342, in conv2d_forward
self.padding, self.dilation, self.groups)
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED
Expected behavior
The script should run the convolution without a cuDNN error and print "success" followed by the output shape.
Environment
PyTorch version: 1.3.0a0+ee77ccb
Is debug build: No
CUDA used to build PyTorch: 9.0.176
OS: Ubuntu 16.04.5 LTS
GCC version: (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609
CMake version: version 3.14.0
Python version: 3.7
Is CUDA available: Yes
CUDA runtime version: Could not collect
GPU models and configuration:
GPU 0: Tesla V100-SXM2-16GB
GPU 1: Tesla V100-SXM2-16GB
GPU 2: Tesla V100-SXM2-16GB
GPU 3: Tesla V100-SXM2-16GB
GPU 4: Tesla V100-SXM2-16GB
GPU 5: Tesla V100-SXM2-16GB
GPU 6: Tesla V100-SXM2-16GB
GPU 7: Tesla V100-SXM2-16GB
Nvidia driver version: 396.44
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.4.1
Versions of relevant libraries:
[pip] numpy==1.17.4
[pip] torch==1.3.0a0+ee77ccb
[conda] blas 1.0 mkl
[conda] magma-cuda90 2.5.0 1 pytorch
[conda] mkl 2019.4 243
[conda] mkl-include 2019.4 243
[conda] mkl-service 2.3.0 py37he904b0f_0
[conda] mkl_fft 1.0.15 py37ha843d7b_0
[conda] mkl_random 1.1.0 py37hd6b4f25_0
[conda] torch 1.3.0a0+ee77ccb pypi_0 pypi
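The cuDNN path reported above points at a system-wide library, so it may help to check whether more than one cuDNN is installed on the machine. The following is only a minimal sketch: the helper name find_cudnn_libraries and the candidate directories are assumptions for illustration, not something the PyTorch build itself uses.

```python
import glob
import os


def find_cudnn_libraries(directories):
    """Return sorted paths of libcudnn shared objects found directly in the given directories."""
    found = []
    for directory in directories:
        found.extend(glob.glob(os.path.join(directory, "libcudnn.so*")))
    return sorted(found)


# Hypothetical candidate directories; adjust them for the machine at hand.
candidates = ["/usr/lib/x86_64-linux-gnu", "/usr/local/cuda/lib64"]
conda_prefix = os.environ.get("CONDA_PREFIX")
if conda_prefix:
    candidates.append(os.path.join(conda_prefix, "lib"))

# Print each match together with the file it resolves to (symlinks followed).
for path in find_cudnn_libraries(candidates):
    print(path, "->", os.path.realpath(path))
```

If several versions show up, the one found first at build time is presumably the one PyTorch was compiled against.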
Additional context
I also printed the torch config using
import torch.__config__
print(torch.__config__.show())
Output:
PyTorch built with:
- GCC 5.4
- Intel(R) Math Kernel Library Version 2019.0.4 Product Build 20190411 for Intel(R) 64 architecture applications
- Intel(R) MKL-DNN v0.20.5 (Git Hash 0125f28c61c1f822fd48570b4c1066f96fcb9b2e)
- OpenMP 201307 (a.k.a. OpenMP 4.0)
- NNPACK is enabled
- CUDA Runtime 9.0
- NVCC architecture flags: -gencode;arch=compute_70,code=sm_70
- CuDNN 7.4.1 (built against CUDA 10.0)
- Magma 2.5.0
- Build settings: BLAS=MKL, BUILD_NAMEDTENSOR=OFF, BUILD_TYPE=Release, CXX_FLAGS= -fvisibility-inlines-hidden -fopenmp -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math, FORCE_FALLBACK_CUDA_MPI=1, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=True, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=ON, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,
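The mismatch between the "CUDA Runtime" and "CuDNN ... (built against CUDA ...)" lines in the output above can also be checked programmatically. This is a rough sketch that only parses text in the format shown above; the function name cuda_versions_from_config is hypothetical, and the sample string is copied from the two relevant lines of my output.

```python
import re


def cuda_versions_from_config(config_text):
    """Return (runtime CUDA version, CUDA version cuDNN was built against).

    Either element is None if the corresponding line is not found.
    """
    runtime = re.search(r"CUDA Runtime (\d+\.\d+)", config_text)
    cudnn = re.search(r"CuDNN [\d.]+ \(built against CUDA (\d+\.\d+)\)", config_text)
    return (
        runtime.group(1) if runtime else None,
        cudnn.group(1) if cudnn else None,
    )


# Two lines taken from the torch.__config__.show() output above.
sample = """- CUDA Runtime 9.0
- CuDNN 7.4.1 (built against CUDA 10.0)"""

runtime_cuda, cudnn_cuda = cuda_versions_from_config(sample)
if runtime_cuda != cudnn_cuda:
    print(f"mismatch: CUDA runtime {runtime_cuda}, cuDNN built against CUDA {cudnn_cuda}")
```

On a real build, the string returned by torch.__config__.show() could be passed in instead of the sample text.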