CUDA not found in CMake when upgrading to libtorch 1.11

Xavier31 · June 13, 2022, 8:51am

I have several libtorch versions installed (1.3, 1.5, 1.10) on Ubuntu 20.04 with which I have no problem building and running.
However, when I tried to upgrade to libtorch 1.11, CMake threw the following error when configuring:

Found CUDA: /usr/local/cuda (found version "11.5") 
The CUDA compiler identification is unknown
CMake Error at /snap/cmake/1088/share/cmake-3.23/Modules/CMakeDetermineCUDACompiler.cmake:633 (message):
  Failed to detect a default CUDA architecture.  

  Compiler output:

Call Stack (most recent call first):
  /home/fbx/libs/libtorch1.11/share/cmake/Caffe2/public/cuda.cmake:41 (enable_language)
  /home/fbx/libs/libtorch1.11/share/cmake/Caffe2/Caffe2Config.cmake:88 (include)
  /home/fbx/libs/libtorch1.11/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
  CMakeLists.txt:35 (find_package)

Configuring incomplete, errors occurred!
I installed the latest version of CMake (3.23.2), but it did not fix the issue.
Any idea why this got broken in 1.11?
Thanks

ptrblck · June 14, 2022, 5:44am

In case you were using different cmake versions, make sure to clean the build as the build files might be dirty and interact in strange ways.
If this also doesn’t help, check the cmake error log and see if e.g. an incompatible C++ compiler was found for the currently used CUDA version.
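
As a rough sketch of the compiler check, the lines below print which host C++ compiler CMake selected, so it can be compared against what the installed CUDA toolkit supports. They assume they are placed after `project()` with C++ enabled. The commented-out `/usr/bin/g++-9` path is only a placeholder, not something taken from this thread.

```cmake
# Report the host C++ compiler CMake picked up.
message(STATUS "Host C++ compiler: ${CMAKE_CXX_COMPILER}")
message(STATUS "Compiler ID/version: ${CMAKE_CXX_COMPILER_ID} ${CMAKE_CXX_COMPILER_VERSION}")

# If that compiler turns out to be too new for the toolkit, nvcc can be pointed
# at another one. Placeholder path; set it before the CUDA language is enabled.
# set(CMAKE_CUDA_HOST_COMPILER /usr/bin/g++-9)
```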

Xavier31 · June 14, 2022, 10:37am

I usually remove the build directory and clean the CMake cache, yes.

For a different problem, I had to revert to CMake 3.16, and got a different error:

Found CUDA: /usr/local/cuda (found version "11.5")
The CUDA compiler identification is NVIDIA 11.5.50
Check for working CUDA compiler: /usr/local/cuda/bin/nvcc
Check for working CUDA compiler: /usr/local/cuda/bin/nvcc -- works
Detecting CUDA compiler ABI info
Detecting CUDA compiler ABI info - done
Caffe2: CUDA detected: 11.5
Caffe2: CUDA nvcc is: /usr/local/cuda/bin/nvcc
Caffe2: CUDA toolkit directory: /usr/local/cuda
Caffe2: Header version is: 11.5
Found CUDNN: /usr/lib/x86_64-linux-gnu/libcudnn.so
Found cuDNN: v8.4.0 (include: /usr/include, library: /usr/lib/x86_64-linux-gnu/libcudnn.so)
/usr/local/cuda/lib64/libnvrtc.so shorthash is 6dcccaf3
CMake Error in cpp_export/build/CMakeFiles/CMakeTmp/CMakeLists.txt:
Target "cmTC_b3f90" requires the language dialect "CUDA17" (with compiler
extensions), but CMake does not know the compile flags to use to enable it.

The CMake error log shows an error in pthread, which is weird (?):

Performing C SOURCE FILE Test CMAKE_HAVE_LIBC_PTHREAD failed with the following output:
Change Dir: cpp_export/build/CMakeFiles/CMakeTmp

Run Build Command(s):/usr/bin/ninja cmTC_54de7 && [1/2] Building C object CMakeFiles/cmTC_54de7.dir/src.c.o
[2/2] Linking C executable cmTC_54de7
FAILED: cmTC_54de7 
: && /usr/bin/cc -DCMAKE_HAVE_LIBC_PTHREAD   CMakeFiles/cmTC_54de7.dir/src.c.o  -o cmTC_54de7   && :
/usr/bin/ld: CMakeFiles/cmTC_54de7.dir/src.c.o: in function `main':
src.c:(.text+0x46): undefined reference to `pthread_create'
/usr/bin/ld: src.c:(.text+0x52): undefined reference to `pthread_detach'
/usr/bin/ld: src.c:(.text+0x63): undefined reference to `pthread_join'
collect2: error: ld returned 1 exit status
ninja: build stopped: subcommand failed.

dfalbel · June 14, 2022, 1:05pm

I had to use something like this at some point:

set_property(TARGET yourtarget PROPERTY CUDA_STANDARD 14)

as it seems that by default CMake sets CUDA_STANDARD to the C++ standard, which might not be supported depending on your CUDA Toolkit version.
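
A project-wide variant might look like the sketch below. It assumes the failing check is one of CMake's own try-compile targets (like `cmTC_b3f90` in the log above), which a property on your own target would not reach. It has not been tested on the setup in this thread.

```cmake
# Place these before find_package(Torch), since that call enables the CUDA language.
# CMAKE_CUDA_STANDARD initializes the CUDA_STANDARD property of targets created later.
set(CMAKE_CUDA_STANDARD 14)
set(CMAKE_CUDA_STANDARD_REQUIRED ON)
```

The value 14 mirrors the `set_property` line above. Whether a higher value works depends on the CMake and CUDA Toolkit versions in use.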

Xavier31 · June 14, 2022, 3:17pm

I tried

set_property(TARGET ${PROJ_NAME} PROPERTY CUDA_STANDARD 14)

with different values for the C++ standard (14, 17, 20), but it did not work:

CMake Error at /usr/share/cmake-3.16/Modules/CMakeDetermineCUDACompiler.cmake:25 (message):
  Could not find compiler set in environment variable CUDACXX:
  CMAKE_CUDA_COMPILER-NOTFOUND.
CMake Error: CMAKE_CUDA_COMPILER not set, after EnableLanguage

dfalbel · June 14, 2022, 4:11pm

I think the old CMake scripts from LibTorch were smarter at finding nvcc. You can set the CUDACXX environment variable to the nvcc location to fix this, or make sure nvcc is in the PATH.
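
If you would rather handle this inside the project, here is a minimal sketch of a lookup placed before `find_package(Torch)`. `NVCC_EXECUTABLE` is a made-up variable name, and the `/usr/local/cuda/bin` hint is just the toolkit directory from the log above.

```cmake
# Only step in when no usable CUDA compiler was given: if(NOT <var>) is also
# true for a cached "...-NOTFOUND" value, like the one in the error above.
if(NOT CMAKE_CUDA_COMPILER AND NOT DEFINED ENV{CUDACXX})
  # Search PATH first, then the hinted toolkit directory.
  find_program(NVCC_EXECUTABLE nvcc HINTS /usr/local/cuda/bin)
  if(NVCC_EXECUTABLE)
    set(CMAKE_CUDA_COMPILER "${NVCC_EXECUTABLE}")
  else()
    message(FATAL_ERROR
      "nvcc not found: set CUDACXX or add the CUDA bin directory to PATH")
  endif()
endif()
```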

FWIW I think this is mostly related to this PR:

github.com/pytorch/pytorch: "Update CMake and use native CUDA language support" (peterbell10, opened 29 Jul 2021)

PyTorch currently uses the old style of compiling CUDA in CMake which is just a bunch of scripts in `FindCUDA.cmake`. Newer versions support CUDA natively as a language just like C++ or C.

Xavier31 · June 14, 2022, 4:27pm

set(CMAKE_CUDA_COMPILER /usr/local/cuda-11.5/bin/nvcc)

did the trick. Thanks !
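
For anyone landing here later, this is roughly where the line sits in a minimal `CMakeLists.txt`. The project name `example_app` and the source file `main.cpp` are placeholders, and the nvcc path is the one from my machine, so adjust it to your install.

```cmake
cmake_minimum_required(VERSION 3.16)
project(example_app LANGUAGES CXX)  # placeholder project name

# Must come before find_package(Torch): per the call stack above, that call
# reaches enable_language(CUDA) in Caffe2's cuda.cmake.
set(CMAKE_CUDA_COMPILER /usr/local/cuda-11.5/bin/nvcc)

find_package(Torch REQUIRED)

add_executable(example_app main.cpp)  # placeholder source file
target_link_libraries(example_app "${TORCH_LIBRARIES}")
```

Passing `-DCMAKE_CUDA_COMPILER=/usr/local/cuda-11.5/bin/nvcc` on the cmake command line should have the same effect without hard-coding the path in the file.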