Failed to find nvToolsExt

Hello,

libtorch 2.0.1 with CUDA 11.8 and cuDNN 8.5 works perfectly fine for me in Visual Studio Community 2022.

However, I now wanted to upgrade to libtorch Preview (nightly) with CUDA 12.1 and cuDNN 8.9, and I get an error message when trying to build with CMake:

[CMake] …
[CMake] -- Found CUDAToolkit: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.1/include (found version "12.1.105")
[CMake] CMake Error at C:/…/libtorch/share/cmake/Caffe2/public/cuda.cmake:74 (message):
[CMake] Failed to find nvToolsExt.

I saw in the NVTX GitHub repository that the most recent NVTX version (nvtx3) is header-only instead of a compiled library.
Could that be the reason for the error I get?
Is this a known bug or am I doing something wrong?

Best Regards,
Max

Hello Max,

The error message you are encountering indicates that CMake is unable to find the nvToolsExt library, which is typically provided by the NVIDIA CUDA Toolkit. This library is used for GPU profiling and debugging.

The issue could be related to a version mismatch between your CUDA installation and the libtorch Preview version you are trying to use. It’s possible that this libtorch Preview build expects a different version of nvToolsExt than the one provided by CUDA 12.1.

To address this issue, you can try the following steps:

  1. Confirm that the nvToolsExt library is indeed present in your CUDA 12.1 installation. Check the C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.1/lib/x64 directory for the library file (nvToolsExt64_1.lib on Windows).

  2. Ensure that the CUDA 12.1 installation directory is correctly set in your system’s environment variables (PATH variable) or in the CMake configuration. This ensures that CMake can locate the necessary CUDA files.

  3. If the library is present and the environment variables are correctly set, you can try manually adding the path to the nvToolsExt library in your CMakeLists.txt file. Add the following line before the find_package(Torch REQUIRED) line:

    link_directories("C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.1/lib/x64")

    Replace the path with the correct location of the lib/x64 directory on your system. Note that link_directories only extends the linker search path, so it may not be enough if the check in cuda.cmake fails because the library file itself is missing.

  4. If the above steps do not resolve the issue, it’s possible that there is an incompatibility between the libtorch Preview version and CUDA 12.1. In that case, you might need to wait for a newer libtorch Preview version that is compatible with CUDA 12.1 or consider downgrading your CUDA installation to a version that is compatible with the libtorch Preview version you want to use.
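
As a rough way to carry out step 1 from CMake itself, the sketch below asks CMake whether it can see the library before Torch is loaded. It is a diagnostic only and not part of libtorch: `NVTOOLSEXT_PROBE` is a hypothetical variable name, and the library names searched are assumptions based on the file name mentioned above.

```cmake
# Diagnostic sketch; place before find_package(Torch REQUIRED).
# Requires CMake 3.17+ for the FindCUDAToolkit module.
find_package(CUDAToolkit REQUIRED)
message(STATUS "CUDA toolkit version: ${CUDAToolkit_VERSION}")
message(STATUS "CUDA library dir:     ${CUDAToolkit_LIBRARY_DIR}")

# NVTOOLSEXT_PROBE is a hypothetical cache variable used only for this check.
find_library(NVTOOLSEXT_PROBE
  NAMES nvToolsExt64_1 nvToolsExt
  HINTS "${CUDAToolkit_LIBRARY_DIR}")

if(NVTOOLSEXT_PROBE)
  message(STATUS "nvToolsExt found at: ${NVTOOLSEXT_PROBE}")
else()
  message(WARNING "nvToolsExt was not found; the library file may be missing from this toolkit install.")
endif()
```

If the warning is printed, adjusting search paths is unlikely to help, because there is no file for CMake to find.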

I hope this helps! Let me know if you have any further questions.

Thank you for the extensive answer. I am running into this problem too, but I cannot find an nvToolsExt lib anywhere under C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.1. I ran the NVIDIA CUDA installer as specified on the download page and installed all possible packages.

How can I best install the nvToolsExt lib? It does not seem to come with the default installer.

I had a similar problem, and what I did was install NVTX with the CUDA 11.8 installer. With CUDA 12 installed, run the CUDA 11.8 installer and select custom installation. Then select Nsight NVTX alone and leave everything else unchecked.

@Te93 This worked for me! Now have PyTorch working on CUDA 12.2

Thanks :)

I’ve just signed up on this forum to say thank you. I was about to create an issue on PyTorch’s GitHub saying that the library link paths for CUDA didn’t match the installation of NVIDIA’s new 12.3 toolkit, but your suggestion to install NVTX from an older version of the toolkit saved the day. I think NVIDIA should include NVTX in newer versions as well.

Thank you.

Odd, the headers are still in 12.1

For anyone else wondering, it’s "C:\Program Files\NVIDIA Corporation\NvToolsExt\lib\x64\nvToolsExt64_1.lib".
No idea if that is recent.
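
A small sketch for checking that location from CMake, in case it helps confirm what the NVTX installer left behind. The default path is the one quoted above; reading the `NVTOOLSEXT_PATH` environment variable is an assumption about what the installer sets on your machine, and `NVTX_HOME_GUESS` is a hypothetical variable name.

```cmake
# Sketch: look for the legacy NVTX import library outside the CUDA toolkit tree.
# NVTX_HOME_GUESS is a hypothetical variable name used only for this check.
set(NVTX_HOME_GUESS "C:/Program Files/NVIDIA Corporation/NvToolsExt")

# Assumption: the NVTX installer may export NVTOOLSEXT_PATH; prefer it when set.
if(DEFINED ENV{NVTOOLSEXT_PATH})
  file(TO_CMAKE_PATH "$ENV{NVTOOLSEXT_PATH}" NVTX_HOME_GUESS)
endif()

if(EXISTS "${NVTX_HOME_GUESS}/lib/x64/nvToolsExt64_1.lib")
  message(STATUS "Legacy NVTX library present under ${NVTX_HOME_GUESS}")
else()
  message(STATUS "No nvToolsExt64_1.lib under ${NVTX_HOME_GUESS}")
endif()
```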

It is no longer like this. The last time I looked, NVIDIA had dropped support for NVTX as of CUDA 12 in favor of NVTX3, which can be found under …/CUDA/%version%/NvtoolsExt; that is NVTX3, not NVTX. From what I see in the GitHub logs, PyTorch tried to upgrade to NVTX3, but the change did not pass all tests, so it was reverted and nobody has fixed it since. What you have may be a remnant of an old or separate installation.

A bit of a rant: PyTorch is far from stable at the moment. Only the bugs that affect the big frameworks and libraries get fixed, and there is so much left to fix that it is starting to be a drag. More is coming: Python will get real multithreading, and a lot will break because PyTorch isn’t built around it and has many static functions, e.g. seed(). It may take a bit longer, but MS wants to drop the console in favor of PowerShell, which means old helper scripts won’t work anymore, and the next VS version is coming in the not-so-distant future.

UPDATE:

For anybody still encountering this issue: as of 6/5/2024 there is still no official solution except to use CUDA 11.8 on Windows, or to install CUDA 11.8 (making sure that you specifically check the box to install Nsight NVTX) and copy nvToolsExt64_1.lib from 11.8 to 12.x’s \lib\x64 directory.

There is already a GitHub issue for this, and the PyTorch developers already have a pull request addressing this bug and another pull request specifically addressing the NVTX changes. They are currently working to merge the fix (it looks like they are just reviewing changes before merging), which will stop CMake from looking for the library in CUDA 12.x+.

This is happening because Nvidia stopped shipping NVTX as a compiled library in CUDA 12.x+ with NVTX version 3, which is great if you’re a developer, as you now only have to include its header files. Nvidia explains this here. Unfortunately, the LibTorch developers did not have time to account for this in their CMake files, which still look for the compiled library. That’s why you have to steal the .lib file from 11.8 in order to build on 12.x+.
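
To illustrate what header-only means for a build script, here is a minimal sketch that locates the NVTX3 header and exposes it through an interface target, with no .lib involved. It is not the fix from the PyTorch pull requests; `NVTX3_INCLUDE_DIR` and `my_nvtx3` are hypothetical names, and the `nvtx3/nvToolsExt.h` layout is assumed to match your toolkit.

```cmake
# Sketch: consume NVTX3 as a header-only dependency (CMake 3.17+).
find_package(CUDAToolkit REQUIRED)

# NVTX3_INCLUDE_DIR is a hypothetical cache variable for this example.
find_path(NVTX3_INCLUDE_DIR
  NAMES nvtx3/nvToolsExt.h
  HINTS ${CUDAToolkit_INCLUDE_DIRS})

if(NVTX3_INCLUDE_DIR)
  # my_nvtx3 is a hypothetical target name; it carries only an include path.
  add_library(my_nvtx3 INTERFACE)
  target_include_directories(my_nvtx3 INTERFACE "${NVTX3_INCLUDE_DIR}")
  message(STATUS "NVTX3 headers found in: ${NVTX3_INCLUDE_DIR}")
else()
  message(WARNING "nvtx3/nvToolsExt.h not found in the CUDA include directories.")
endif()
```

A project target could then pick up the headers with target_link_libraries(<your_target> PRIVATE my_nvtx3); nothing is actually linked, since the interface target only adds an include directory.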

I’ve been stuck on this for a few weeks, so hopefully this helps somebody out.

Maybe you need to modify some of the CMake files shipped with libtorch. Please read this post as a reference: "Fixing the CMake error 'Failed to find nvToolsExt' when using LibTorch + CUDA 12.4 on Windows". (The post is in Chinese, so you may need a translator if you are not familiar with the language.)