cuDNN not found in manual build

When manually building PyTorch (v2.1.1), I tried using CUDNN_INCLUDE_DIR and CUDNN_LIB_DIR to point the build at cuDNN's include and lib directories (as mentioned in, e.g., How to build with cuDNN - #7 by EthanZhangYi).

However, even though CUDNN_INCLUDE_DIR contains cudnn.h, I get the following error message when calling python setup.py install:

FAILED: caffe2/CMakeFiles/cuda_cudnn_test.dir/__/aten/src/ATen/test/cuda_cudnn_test.cpp.o 
/usr/bin/c++ -DHAVE_MALLOC_USABLE_SIZE=1 -DHAVE_MMAP=1 -DHAVE_SHM_OPEN=1 -DHAVE_SHM_UNLINK=1 -DIDEEP_USE_MKL -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DUSE_C10D_GLOO -DUSE_C10D_NCCL -DUSE_DISTRIBUTED -DUSE_EXTERNAL_MZCRC -DUSE_RPC -DUSE_TENSORPIPE -D_FILE_OFFSET_BITS=64 -I/home/christoph/build/pytorch/build/aten/src -I/home/christoph/build/pytorch/aten/src -I/home/christoph/build/pytorch/build -I/home/christoph/build/pytorch -I/home/christoph/build/pytorch/cmake/../third_party/benchmark/include -I/home/christoph/build/pytorch/third_party/onnx -I/home/christoph/build/pytorch/build/third_party/onnx -I/home/christoph/build/pytorch/third_party/foxi -I/home/christoph/build/pytorch/build/third_party/foxi -I/home/christoph/build/pytorch/build/caffe2/aten/src -I/home/christoph/build/pytorch/aten/src/ATen/.. -I/home/christoph/build/pytorch/third_party/miniz-2.1.0 -I/home/christoph/build/pytorch/torch/csrc/api -I/home/christoph/build/pytorch/torch/csrc/api/include -I/home/christoph/build/pytorch/c10/.. -I/home/christoph/build/pytorch/c10/cuda/../.. 
-isystem /home/christoph/build/pytorch/build/third_party/gloo -isystem /home/christoph/build/pytorch/cmake/../third_party/gloo -isystem /home/christoph/build/pytorch/cmake/../third_party/tensorpipe/third_party/libuv/include -isystem /home/christoph/build/pytorch/cmake/../third_party/googletest/googlemock/include -isystem /home/christoph/build/pytorch/cmake/../third_party/googletest/googletest/include -isystem /home/christoph/build/pytorch/third_party/protobuf/src -isystem /home/christoph/miniconda3/envs/vpt/include -isystem /home/christoph/build/pytorch/third_party/gemmlowp -isystem /home/christoph/build/pytorch/third_party/neon2sse -isystem /home/christoph/build/pytorch/third_party/XNNPACK/include -isystem /home/christoph/build/pytorch/third_party/ittapi/include -isystem /home/christoph/build/pytorch/cmake/../third_party/eigen -isystem /run/host/usr/local/cuda-12.1/include -isystem /home/christoph/build/pytorch/third_party/ideep/mkl-dnn/include/oneapi/dnnl -isystem /home/christoph/build/pytorch/third_party/ideep/include -isystem /home/christoph/build/pytorch/third_party/googletest/googletest/include -isystem /home/christoph/build/pytorch/third_party/googletest/googletest -D_GLIBCXX_USE_CXX11_ABI=1 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=range-loop-construct -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-invalid-partial-specialization -Wno-unused-private-field -Wno-aligned-allocation-unavailable -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable 
-Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow -DHAVE_AVX2_CPU_DEFINITION -O3 -DNDEBUG -DNDEBUG -std=gnu++17 -fPIE -DMKL_HAS_SBGEMM -DTORCH_USE_LIBUV -DCAFFE2_USE_GLOO -DTH_HAVE_THREAD -MD -MT caffe2/CMakeFiles/cuda_cudnn_test.dir/__/aten/src/ATen/test/cuda_cudnn_test.cpp.o -MF caffe2/CMakeFiles/cuda_cudnn_test.dir/__/aten/src/ATen/test/cuda_cudnn_test.cpp.o.d -o caffe2/CMakeFiles/cuda_cudnn_test.dir/__/aten/src/ATen/test/cuda_cudnn_test.cpp.o -c /home/christoph/build/pytorch/aten/src/ATen/test/cuda_cudnn_test.cpp
In file included from /home/christoph/build/pytorch/aten/src/ATen/cudnn/Descriptors.h:8,
                 from /home/christoph/build/pytorch/aten/src/ATen/test/cuda_cudnn_test.cpp:5:
/home/christoph/build/pytorch/aten/src/ATen/cudnn/cudnn-wrapper.h:3:10: fatal error: cudnn.h: No such file or directory
    3 | #include <cudnn.h>
      |          ^~~~~~~~~
compilation terminated.

This would indicate that CUDNN_INCLUDE_DIR and CUDNN_LIB_DIR do not work as intended. Or am I doing something wrong here?
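
One way to narrow this down is to check whether any include directory in the failing compiler command actually contains cudnn.h. The following is a minimal Python sketch, not part of the PyTorch build system; the helper names include_dirs and dirs_with_header are made up for illustration, and it only handles the -I and -isystem flags that appear in the command above.

```python
import os
import shlex


def include_dirs(compile_command):
    """Collect directories passed as -I<dir>, -I <dir>, or -isystem <dir>."""
    tokens = shlex.split(compile_command)
    dirs = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok in ("-I", "-isystem") and i + 1 < len(tokens):
            # Flag and directory are separate tokens.
            dirs.append(tokens[i + 1])
            i += 2
            continue
        if tok.startswith("-I") and len(tok) > 2:
            # Directory is glued to the flag, e.g. -I/some/path.
            dirs.append(tok[2:])
        i += 1
    return dirs


def dirs_with_header(dirs, header="cudnn.h"):
    """Return the subset of dirs that contain the given header file."""
    return [d for d in dirs if os.path.isfile(os.path.join(d, header))]
```

Passing the full command from the log to include_dirs and the result to dirs_with_header lists the directories where the compiler could find cudnn.h. An empty list would mean the directory set in CUDNN_INCLUDE_DIR never reached this target's include path; in the command above, the only CUDA-related entry is /run/host/usr/local/cuda-12.1/include.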

Most likely your dev environment is not set up correctly; you can test this by trying to build any cuDNN application locally.
I don’t know where your CUDA toolkit etc. is installed, but if you’ve used the default locations, installing cuDNN should work via:

cp -a cudnn-linux-x86_64-VERSION-archive/include/* /usr/local/cuda/include/
cp -a cudnn-linux-x86_64-VERSION-archive/lib/* /usr/local/cuda/lib64/
ldconfig

You wouldn’t need to specify any cuDNN-specific env variables afterwards, as cuDNN is now installed into the CUDA toolkit directories.
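
To confirm that the copy landed where the compiler and linker will look, something like the following Python sketch could be used. It is an illustration under assumptions, not an official tool: the function names are invented here, the default path /usr/local/cuda mirrors the commands above, and the version is read from the CUDNN_MAJOR, CUDNN_MINOR, and CUDNN_PATCHLEVEL macros in cudnn_version.h (or cudnn.h for older releases).

```python
import glob
import os
import re

# Matches lines such as "#define CUDNN_MAJOR 8" in the cuDNN headers.
_VERSION_RE = re.compile(r"#define\s+CUDNN_(MAJOR|MINOR|PATCHLEVEL)\s+(\d+)")


def read_cudnn_version(include_dir):
    """Return (major, minor, patch) parsed from the cuDNN headers, or None."""
    for name in ("cudnn_version.h", "cudnn.h"):
        path = os.path.join(include_dir, name)
        if not os.path.isfile(path):
            continue
        with open(path, encoding="utf-8", errors="replace") as fh:
            found = dict(_VERSION_RE.findall(fh.read()))
        if {"MAJOR", "MINOR", "PATCHLEVEL"} <= found.keys():
            return tuple(int(found[k]) for k in ("MAJOR", "MINOR", "PATCHLEVEL"))
    return None


def cudnn_status(cuda_home="/usr/local/cuda"):
    """Report whether the cuDNN header and libraries exist under cuda_home."""
    include_dir = os.path.join(cuda_home, "include")
    lib_dir = os.path.join(cuda_home, "lib64")
    return {
        "header_found": os.path.isfile(os.path.join(include_dir, "cudnn.h")),
        "libraries": sorted(glob.glob(os.path.join(lib_dir, "libcudnn*.so*"))),
        "version": read_cudnn_version(include_dir),
    }
```

Calling cudnn_status() with no argument checks the default toolkit location; pass a different directory if the toolkit lives elsewhere. If header_found is False or libraries is empty, the copy did not reach the directories the build uses.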

Thanks, I didn’t know it is now supposed to be copied into the CUDA directories, as I only ever used the Debian packages in the past.