I have come back to a project that was building fine a few months ago. I had to reinstall the NVIDIA drivers, so I completely reinstalled the drivers, CUDA, and cuDNN, and am now trying to rebuild my C++ project using the most recent nightly build.
I am using cuda-12-1 and cuDNN 8.9.2.26 on Ubuntu 22.04. In the past I built exclusively with g++ 11.3.0. Building with it now produces the undefined reference to cupti errors, and also fails with internal compiler error: Segmentation fault. I can start a separate thread about the segfaults.
When I build with clang 13.0.1 instead, I do not see the segfaults, but I do see numerous linker errors of the form undefined reference to cuptiXXX. The references all come from libkineto::CuptiActivityApi methods.
After some experimentation I discovered that including all of the following stanza in the root CMakeLists.txt file fixes the problem:
set(TorchDIR /usr/local/libtorch)
list(APPEND CMAKE_PREFIX_PATH ${TorchDIR})
find_package(Torch REQUIRED)
list(APPEND TORCH_LIBRARIES "/usr/local/cuda/lib64/libcupti.so")
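For anyone adapting this, a slightly more portable variant could look up libcupti with find_library instead of hard-coding the full path. This is only a minimal sketch, not a confirmed fix: the extras/CUPTI/lib64 hint is an assumption about where some CUDA installs place the library, and the project name, my_app, and main.cpp are hypothetical.

```cmake
cmake_minimum_required(VERSION 3.18)
project(cupti_link_example LANGUAGES CXX)  # hypothetical project name

set(TorchDIR /usr/local/libtorch)
list(APPEND CMAKE_PREFIX_PATH ${TorchDIR})
find_package(Torch REQUIRED)

# Search for libcupti rather than hard-coding the file name.
# The second hint is an assumption; adjust it to the local CUDA install.
find_library(CUPTI_LIBRARY
  NAMES cupti
  HINTS /usr/local/cuda/lib64 /usr/local/cuda/extras/CUPTI/lib64)
if(NOT CUPTI_LIBRARY)
  message(FATAL_ERROR "libcupti not found; set CUPTI_LIBRARY manually")
endif()
list(APPEND TORCH_LIBRARIES "${CUPTI_LIBRARY}")

add_executable(my_app main.cpp)  # hypothetical target and source file
target_link_libraries(my_app "${TORCH_LIBRARIES}")
```

If the lookup fails, passing -DCUPTI_LIBRARY=/path/to/libcupti.so on the cmake command line should override the search, since find_library skips searching when the cache variable is already set.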
My project has a subdirectory where I create a library that wraps all of the libtorch usage behind my own API. Putting the above stanza in the CMakeLists.txt for that subdirectory does NOT work.
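A possible explanation, not verified against this project, is CMake scoping: a variable set in a subdirectory's CMakeLists.txt is local to that directory, and imported targets created by find_package are visible only in the directory where it is called and the directories below it. A target defined in the root would therefore not see a TORCH_LIBRARIES that was modified in the subdirectory. The layout below is a sketch of the arrangement that matches what worked for me, with hypothetical names (my_project, torch_wrapper, my_app, wrapper.cpp, main.cpp).

```cmake
# Root CMakeLists.txt
cmake_minimum_required(VERSION 3.18)
project(my_project LANGUAGES CXX)  # hypothetical project name

set(TorchDIR /usr/local/libtorch)
list(APPEND CMAKE_PREFIX_PATH ${TorchDIR})
find_package(Torch REQUIRED)
list(APPEND TORCH_LIBRARIES "/usr/local/cuda/lib64/libcupti.so")

# The subdirectory inherits TORCH_LIBRARIES and the Torch targets from here.
add_subdirectory(torch_wrapper)

add_executable(my_app main.cpp)  # hypothetical application target
target_link_libraries(my_app PRIVATE torch_wrapper)
```

```cmake
# torch_wrapper/CMakeLists.txt (hypothetical wrapper library)
add_library(torch_wrapper wrapper.cpp)

# PUBLIC so that anything linking torch_wrapper also links libtorch and libcupti.
target_link_libraries(torch_wrapper PUBLIC ${TORCH_LIBRARIES})
```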
I’m posting this here 1) in case anyone else has this problem and 2) to suggest that the libtorch team figure out the correct solution (e.g. including libcupti.so in TORCH_LIBRARIES in the first place).