torch::jit::script::Module::forward() fails (LoadLibraryA) in a C++ static-lib application on Windows

Hi,

I am trying to train on Linux (Python) and do inference on Windows with a C++ application linked against the static libtorch libraries.
When I call torch::jit::script::Module::forward(), the following error occurs.
The same application built against the DLLs does not fail.

The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
  File "code/__torch__/Model.py", line 37, in forward
    _19 = (_6).forward((_7).forward((_8).forward(_18, ), ), )
    input0 = torch.cat([(_5).forward(_19, ), _15], 1)
    _20 = (_3).forward((_4).forward(input0, ), )
                        ~~~~~~~~~~~ <--- HERE
    _21 = (_2).forward((_14).forward2(_20, ), )
    return (_0).forward((_1).forward(_21, ), )
  File "code/__torch__/CompactModel.py", line 36, in forward
    _18 = (_9).forward((_10).forward(_17, ), )
    _19 = (_6).forward((_7).forward((_8).forward(_18, ), ), )
    input0 = torch.cat([(_5).forward(_19, ), _15], 1)
             ~~~~~~~~~ <--- HERE
    _20 = (_3).forward((_4).forward(input0, ), )
    _21 = (_2).forward((_14).forward2(_20, ), )

Traceback of TorchScript, original code (most recent call last):
/docker_share/source/Model.py(193): forward
/usr/local/lib/python3.8/site-packages/torch/nn/modules/module.py(534): _slow_forward
/usr/local/lib/python3.8/site-packages/torch/nn/modules/module.py(548): __call__
/usr/local/lib/python3.8/site-packages/torch/jit/__init__.py(1027): trace_module
/usr/local/lib/python3.8/site-packages/torch/jit/__init__.py(873): trace
./source/jitrace.py(700): <module>
RuntimeError: error in LoadLibraryA
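
For context, the inference side is the standard libtorch load-and-forward pattern; here is a minimal sketch (the module path, input shape and device handling are placeholders, not my actual code):

#include <torch/script.h>
#include <iostream>
#include <vector>

int main() {
  try {
    // Load the TorchScript module traced on Linux (placeholder path).
    torch::jit::script::Module module = torch::jit::load("model.pt");
    module.to(torch::kCUDA);   // the NVRTC stub only matters on the CUDA path
    module.eval();

    // Placeholder input; the real model expects different shapes.
    std::vector<torch::jit::IValue> inputs;
    inputs.push_back(torch::randn({1, 3, 224, 224}).to(torch::kCUDA));

    // This is the call that raises "error in LoadLibraryA" in the static build.
    at::Tensor output = module.forward(inputs).toTensor();
    std::cout << output.sizes() << std::endl;
  } catch (const c10::Error& e) {
    std::cerr << e.what() << std::endl;
    return 1;
  }
  return 0;
}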

The error message changed from “LoadLibraryA” to “GetProcAddress” when I placed “caffe2_nvrtc.dll” next to the application. caffe2_nvrtc.dll is created under build/bin, and caffe2_nvrtc.lib is not created.
Is caffe2_nvrtc.dll related to this problem?

Do you have any suggestions?
Thank you.

Please copy all the DLLs to the directory of your application.

There are NO DLLs in my application directory.
The CUDA and cuDNN libraries are on the PATH.

Although I do not want to use the dynamic-link libraries,
the application succeeded with the following:

  • link the officially distributed “caffe2_nvrtc.lib”
  • place “caffe2_nvrtc.dll” next to the application.

So, to make it work with a static library, I tried the following, and it didn’t work:

  • change CMakeLists.txt under the “caffe2” directory as below and rebuild:
565c565
<     add_library(caffe2_nvrtc SHARED ${ATen_NVRTC_STUB_SRCS})
---
>     add_library(caffe2_nvrtc ${ATen_NVRTC_STUB_SRCS})
  • caffe2_nvrtc.lib is created (and caffe2_nvrtc.dll is not).
  • link it to the app.

But it does work if “caffe2_nvrtc.dll” (the official one) is placed next to the application.

The DLLs that are loaded when the application succeeds, but not when it fails, are:

  • caffe2_nvrtc.dll
  • nvrtc64_101_0.dll

So what should I do to make the static caffe2_nvrtc.lib work?

Hey, we don’t provide static libs currently. You’ll have to build that from source.

Unlike the .so/.a relationship on Linux, a .lib file is not necessarily a static library; it can also be an import library for a DLL.


Let me rephrase what I wanted to say.
What I want is a working static build of libtorch.
I have built the static libtorch from source with “set BUILD_SHARED_LIBS=OFF”, but:

  • caffe2_nvrtc.lib is NOT created.
  • caffe2_nvrtc.dll is created under torch/bin.

My assumption is as follows.
“SHARED” is hard-coded in the add_library call, so the DLL is always created. At the same time,
because BUILD_SHARED_LIBS=OFF, dllexport is not defined, so no usable .lib is produced. The files under lib are linked into the application and the error occurs.
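
By “dllexport” I mean the usual Windows export-macro pattern; a generic sketch (the macro and function names are illustrative, not PyTorch’s actual ones):

// Illustrative only: the typical Windows export macro.
// If nothing in the DLL is marked dllexport, the linker produces no import .lib.
#if defined(MYLIB_BUILDING_DLL)
  #define MYLIB_API __declspec(dllexport)   // building the DLL: export symbols
#elif defined(MYLIB_USING_DLL)
  #define MYLIB_API __declspec(dllimport)   // consuming the DLL
#else
  #define MYLIB_API                         // static build: no decoration
#endif

MYLIB_API void load_nvrtc_stub();           // hypothetical exported function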

I now know that the application succeeds with the official caffe2_nvrtc.lib and caffe2_nvrtc.dll.
What I’d like to know is how to create a static libtorch library.

I have rewritten CMakeLists.txt and built a static version of libtorch, and:

  • caffe2_nvrtc.lib is created (and this should be a static lib, right?)
  • caffe2_nvrtc.dll is NOT created.

Linking it into the app ends up with the error I described at the beginning.
Placing the official caffe2_nvrtc.dll next to the app makes it work (meaning the static lib is not created correctly?).

Any suggestions to make this attempt succeed?

Maybe some of the libs are optimized away. You could try passing /WHOLEARCHIVE:caffe2_nvrtc.lib in your project to force the linker to stop doing that.

I had the same thought.

  • /WHOLEARCHIVE:caffe2_nvrtc.lib
  • link C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\lib\x64\nvrtc.lib

I tried the above and still get the same error.
The application does not load “nvrtc64_101_0.dll”.
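
For reference, one way to check at runtime which of these DLLs is actually in the process is GetModuleHandleA, which returns NULL when the module has not been loaded; a small debugging sketch:

#include <windows.h>
#include <cstdio>

// Debugging sketch: report whether the NVRTC-related DLLs are present in the process.
// In practice, call report_loaded() from inside the inference application,
// right before or after the failing forward().
static void report_loaded(const char* name) {
  HMODULE h = GetModuleHandleA(name);  // NULL if the DLL has not been loaded
  std::printf("%-20s %s\n", name, h ? "loaded" : "NOT loaded");
}

int main() {
  report_loaded("caffe2_nvrtc.dll");
  report_loaded("nvrtc64_101_0.dll");
  return 0;
}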

Any other suggestions?

Use WSL2 to run Ubuntu on Windows!

You may need to link other CUDA libs as well, like cudart64_xxx.lib. But some of them don’t have static variants, so you will still need some DLLs.

Well, I don’t think WSL2 is a good choice for deployment.

I’ve tried a few other things, and the change below to caffe2/CMakeLists.txt worked!

diff --git a/caffe2/CMakeLists.txt b/caffe2/CMakeLists.txt
index 8025a7de3c..8e94978e72 100644
--- a/caffe2/CMakeLists.txt
+++ b/caffe2/CMakeLists.txt
@@ -561,7 +561,7 @@ if (NOT INTERN_BUILD_MOBILE OR NOT BUILD_CAFFE2_MOBILE)
       ${TORCH_SRC_DIR}/csrc/cuda/comm.cpp
       ${TORCH_SRC_DIR}/csrc/jit/tensorexpr/cuda_codegen.cpp
     )
-    add_library(caffe2_nvrtc SHARED ${ATen_NVRTC_STUB_SRCS})
+    add_library(caffe2_nvrtc ${ATen_NVRTC_STUB_SRCS})
     target_link_libraries(caffe2_nvrtc ${CUDA_NVRTC} ${CUDA_CUDA_LIB} ${CUDA_NVRTC_LIB})
     target_include_directories(caffe2_nvrtc PRIVATE ${CUDA_INCLUDE_DIRS})
     install(TARGETS caffe2_nvrtc DESTINATION "${TORCH_INSTALL_LIB_DIR}")
@@ -703,6 +703,9 @@ ELSEIF(USE_CUDA)
   cuda_add_library(torch_cuda ${Caffe2_GPU_SRCS})
   set(CUDA_LINK_LIBRARIES_KEYWORD)
   torch_compile_options(torch_cuda)  # see cmake/public/utils.cmake
+  if (NOT BUILD_SHARED_LIBS)
+    target_compile_definitions(torch_cuda PRIVATE USE_DIRECT_NVRTC)
+  endif()
   if (USE_NCCL)
     target_link_libraries(torch_cuda PRIVATE __caffe2_nccl)

In aten/src/ATen/cuda/detail/CUDAHooks.cpp, the “#ifdef USE_DIRECT_NVRTC” directive is used.
But “USE_DIRECT_NVRTC” was not defined in any CMakeLists.txt, and because of that,
an application linked against static libtorch still tries to load “caffe2_nvrtc.dll”.
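
To illustrate my understanding of the mechanism, here is a simplified sketch of the two code paths (this is not the actual CUDAHooks.cpp code; the types and the exported symbol name are illustrative):

#include <windows.h>
#include <stdexcept>

struct NVRTCStub;                         // opaque table of NVRTC/CUDA driver function pointers

#ifdef USE_DIRECT_NVRTC
// Static-friendly path: the stub is compiled into the binary itself,
// so nothing has to be located on disk at runtime.
const NVRTCStub* builtin_nvrtc_stub();    // provided by the directly linked stub sources
const NVRTCStub* get_nvrtc_stub() {
  return builtin_nvrtc_stub();
}
#else
// Default path: the stub lives in caffe2_nvrtc.dll and is loaded lazily.
// With a static libtorch and no DLL next to the application, LoadLibraryA
// fails here, which surfaces as the "error in LoadLibraryA" from forward().
const NVRTCStub* get_nvrtc_stub() {
  HMODULE h = LoadLibraryA("caffe2_nvrtc.dll");
  if (!h) throw std::runtime_error("error in LoadLibraryA");
  auto factory = reinterpret_cast<const NVRTCStub* (*)()>(
      GetProcAddress(h, "load_nvrtc"));   // exported factory; symbol name is illustrative
  if (!factory) throw std::runtime_error("error in GetProcAddress");
  return factory();
}
#endif

This would also explain why the error shifted from “LoadLibraryA” to “GetProcAddress” once a DLL was present but the expected symbol could not be resolved.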

Interesting findings. Thanks for tracking this down.