torch::jit::script::Module::forward() fails (LoadLibraryA) in a C++ static-lib application on Windows

Hi,

I am trying to train on Linux (Python) and do inference on Windows with a C++ application linked against the static libtorch libraries.
When I call torch::jit::script::Module::forward(), the following error occurs.
The same application built against the DLLs does not fail.

The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
  File "code/__torch__/Model.py", line 37, in forward
    _19 = (_6).forward((_7).forward((_8).forward(_18, ), ), )
    input0 = torch.cat([(_5).forward(_19, ), _15], 1)
    _20 = (_3).forward((_4).forward(input0, ), )
                        ~~~~~~~~~~~ <--- HERE
    _21 = (_2).forward((_14).forward2(_20, ), )
    return (_0).forward((_1).forward(_21, ), )
  File "code/__torch__/CompactModel.py", line 36, in forward
    _18 = (_9).forward((_10).forward(_17, ), )
    _19 = (_6).forward((_7).forward((_8).forward(_18, ), ), )
    input0 = torch.cat([(_5).forward(_19, ), _15], 1)
             ~~~~~~~~~ <--- HERE
    _20 = (_3).forward((_4).forward(input0, ), )
    _21 = (_2).forward((_14).forward2(_20, ), )

Traceback of TorchScript, original code (most recent call last):
/docker_share/source/Model.py(193): forward
/usr/local/lib/python3.8/site-packages/torch/nn/modules/module.py(534): _slow_forward
/usr/local/lib/python3.8/site-packages/torch/nn/modules/module.py(548): __call__
/usr/local/lib/python3.8/site-packages/torch/jit/__init__.py(1027): trace_module
/usr/local/lib/python3.8/site-packages/torch/jit/__init__.py(873): trace
./source/jitrace.py(700): <module>
RuntimeError: error in LoadLibraryA
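
For context, the inference side is the standard libtorch load-and-forward pattern; here is a minimal sketch (the module path, input shape and device handling are placeholders, not my actual code):

#include <torch/script.h>
#include <iostream>
#include <vector>

int main() {
  try {
    // Load the TorchScript module traced on Linux (placeholder path).
    torch::jit::script::Module module = torch::jit::load("model.pt");
    module.to(torch::kCUDA);   // the NVRTC stub only matters on the CUDA path
    module.eval();

    // Placeholder input; the real model expects different shapes.
    std::vector<torch::jit::IValue> inputs;
    inputs.push_back(torch::randn({1, 3, 224, 224}).to(torch::kCUDA));

    // This is the call that raises "error in LoadLibraryA" in the static build.
    at::Tensor output = module.forward(inputs).toTensor();
    std::cout << output.sizes() << std::endl;
  } catch (const c10::Error& e) {
    std::cerr << e.what() << std::endl;
    return 1;
  }
  return 0;
}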

The error message changed from “LoadLibraryA” to “GetProcAddress” when I placed “caffe2_nvrtc.dll” next to the application. caffe2_nvrtc.dll is created under build/bin, and caffe2_nvrtc.lib is not created.
Is caffe2_nvrtc.dll related to this problem?

Do you have any suggestions?
Thank you.

Please copy all the DLLs to the directory of your application.

There are NO DLLs in my application directory.
The CUDA and cuDNN libraries are on the PATH.

Although I do not want to use the dynamic-link libraries,
the application succeeded with the following:

  • link the officially distributed “caffe2_nvrtc.lib”
  • place “caffe2_nvrtc.dll” next to the application.

So, to make it work with a static library, I tried the following, and it didn’t work:

  • change CMakeLists.txt under the “caffe2” directory as below and rebuild:
565c565
<     add_library(caffe2_nvrtc SHARED ${ATen_NVRTC_STUB_SRCS})
---
>     add_library(caffe2_nvrtc ${ATen_NVRTC_STUB_SRCS})
  • caffe2_nvrtc.lib is created (and caffe2_nvrtc.dll is not).
  • link it to the app.

But it does work if “caffe2_nvrtc.dll” (the official one) is placed next to the application.

The DLLs that are loaded when the application succeeds, but not when it fails, are:

  • caffe2_nvrtc.dll
  • nvrtc64_101_0.dll

So what should I do to make the static caffe2_nvrtc.lib work?

Hey, we don’t provide static libs currently. You’ll have to build that from source.

Unlike the .so/.a relationship on Linux, a .lib file is not necessarily a static library; it can also be an import library for a DLL.


Let me rephrase what I wanted to say.
What I want is a working static build of libtorch.
I have built the static libtorch from source with “set BUILD_SHARED_LIBS=OFF”, but:

  • caffe2_nvrtc.lib is NOT created.
  • caffe2_nvrtc.dll is created under torch/bin.

My assumption is as follows.
“SHARED” is hard-coded in the add_library call, so the DLL is always created. At the same time,
because BUILD_SHARED_LIBS=OFF, dllexport is not defined, so no usable .lib is produced. The files under lib are linked into the application and the error occurs.
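
By “dllexport” I mean the usual Windows export-macro pattern; a generic sketch (the macro and function names are illustrative, not PyTorch’s actual ones):

// Illustrative only: the typical Windows export macro.
// If nothing in the DLL is marked dllexport, the linker produces no import .lib.
#if defined(MYLIB_BUILDING_DLL)
  #define MYLIB_API __declspec(dllexport)   // building the DLL: export symbols
#elif defined(MYLIB_USING_DLL)
  #define MYLIB_API __declspec(dllimport)   // consuming the DLL
#else
  #define MYLIB_API                         // static build: no decoration
#endif

MYLIB_API void load_nvrtc_stub();           // hypothetical exported function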

I now know that the application succeeds with the official caffe2_nvrtc.lib and caffe2_nvrtc.dll.
What I’d like to know is how to create a static libtorch library.

I have rewritten CMakeLists.txt and built a static version of libtorch, and:

  • caffe2_nvrtc.lib is created (and this should be a static lib, right?)
  • caffe2_nvrtc.dll is NOT created.

Linking it into the app ends up with the error I described at the beginning.
Placing the official caffe2_nvrtc.dll next to the app makes it work (meaning the static lib is not created correctly?).

Any suggestions to make this attempt succeed?

Maybe some of the libs are optimized away. You could try passing /WHOLEARCHIVE:caffe2_nvrtc.lib in your project to force the linker to stop doing that.

I had the same thought.

  • /WHOLEARCHIVE:caffe2_nvrtc.lib
  • link C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\lib\x64\nvrtc.lib

I tried the above and still get the same error.
The application does not load “nvrtc64_101_0.dll”.
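
For reference, one way to check at runtime which of these DLLs is actually in the process is GetModuleHandleA, which returns NULL when the module has not been loaded; a small debugging sketch:

#include <windows.h>
#include <cstdio>

// Debugging sketch: report whether the NVRTC-related DLLs are present in the process.
// In practice, call report_loaded() from inside the inference application,
// right before or after the failing forward().
static void report_loaded(const char* name) {
  HMODULE h = GetModuleHandleA(name);  // NULL if the DLL has not been loaded
  std::printf("%-20s %s\n", name, h ? "loaded" : "NOT loaded");
}

int main() {
  report_loaded("caffe2_nvrtc.dll");
  report_loaded("nvrtc64_101_0.dll");
  return 0;
}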

Any other suggestions?

Use WSL2 to run Ubuntu on Windows!

You may need to link other CUDA libs as well, like cudart64_xxx.lib. But some of them don’t have static variants, so you will still need some DLLs.

Well, I don’t think WSL2 is a good choice for deployment.

I’ve tried a few other things, and the change below to caffe2/CMakeLists.txt worked!

diff --git a/caffe2/CMakeLists.txt b/caffe2/CMakeLists.txt
index 8025a7de3c..8e94978e72 100644
--- a/caffe2/CMakeLists.txt
+++ b/caffe2/CMakeLists.txt
@@ -561,7 +561,7 @@ if (NOT INTERN_BUILD_MOBILE OR NOT BUILD_CAFFE2_MOBILE)
       ${TORCH_SRC_DIR}/csrc/cuda/comm.cpp
       ${TORCH_SRC_DIR}/csrc/jit/tensorexpr/cuda_codegen.cpp
     )
-    add_library(caffe2_nvrtc SHARED ${ATen_NVRTC_STUB_SRCS})
+    add_library(caffe2_nvrtc ${ATen_NVRTC_STUB_SRCS})
     target_link_libraries(caffe2_nvrtc ${CUDA_NVRTC} ${CUDA_CUDA_LIB} ${CUDA_NVRTC_LIB})
     target_include_directories(caffe2_nvrtc PRIVATE ${CUDA_INCLUDE_DIRS})
     install(TARGETS caffe2_nvrtc DESTINATION "${TORCH_INSTALL_LIB_DIR}")
@@ -703,6 +703,9 @@ ELSEIF(USE_CUDA)
   cuda_add_library(torch_cuda ${Caffe2_GPU_SRCS})
   set(CUDA_LINK_LIBRARIES_KEYWORD)
   torch_compile_options(torch_cuda)  # see cmake/public/utils.cmake
+  if (NOT BUILD_SHARED_LIBS)
+    target_compile_definitions(torch_cuda PRIVATE USE_DIRECT_NVRTC)
+  endif()
   if (USE_NCCL)
     target_link_libraries(torch_cuda PRIVATE __caffe2_nccl)

In aten/src/ATen/cuda/detail/CUDAHooks.cpp, the “#ifdef USE_DIRECT_NVRTC” directive is used.
But “USE_DIRECT_NVRTC” was not defined in any CMakeLists.txt, and because of that,
an application linked against static libtorch still tries to load “caffe2_nvrtc.dll”.
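
To illustrate my understanding of the mechanism, here is a simplified sketch of the two code paths (this is not the actual CUDAHooks.cpp code; the types and the exported symbol name are illustrative):

#include <windows.h>
#include <stdexcept>

struct NVRTCStub;                         // opaque table of NVRTC/CUDA driver function pointers

#ifdef USE_DIRECT_NVRTC
// Static-friendly path: the stub is compiled into the binary itself,
// so nothing has to be located on disk at runtime.
const NVRTCStub* builtin_nvrtc_stub();    // provided by the directly linked stub sources
const NVRTCStub* get_nvrtc_stub() {
  return builtin_nvrtc_stub();
}
#else
// Default path: the stub lives in caffe2_nvrtc.dll and is loaded lazily.
// With a static libtorch and no DLL next to the application, LoadLibraryA
// fails here, which surfaces as the "error in LoadLibraryA" from forward().
const NVRTCStub* get_nvrtc_stub() {
  HMODULE h = LoadLibraryA("caffe2_nvrtc.dll");
  if (!h) throw std::runtime_error("error in LoadLibraryA");
  auto factory = reinterpret_cast<const NVRTCStub* (*)()>(
      GetProcAddress(h, "load_nvrtc"));   // exported factory; symbol name is illustrative
  if (!factory) throw std::runtime_error("error in GetProcAddress");
  return factory();
}
#endif

This would also explain why the error shifted from “LoadLibraryA” to “GetProcAddress” once a DLL was present but the expected symbol could not be resolved.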

Interesting findings. Thanks for tracking this down.