Libtorch on Windows prebuilt binaries questions

Hi all,

I am using the prebuilt libtorch libraries for Windows in my C++ programs, and everything works just fine. I have two questions:

  1. How are the CUDA files (*.cu) compiled in these binaries? I am especially interested in the -gencode parameter values passed to the NVCC compiler when creating these binaries. This will tell me on which GPU platforms the libraries will run. For example, is there also a PTX version embedded?

  2. The prebuilt libs in libtorch for Windows are shared. I’ve tried to compile from source to create a static torch.lib, but even after tweaking build parameters here and there, I failed. Can anyone share how to create a static torch.lib for Windows?

Thanks.

Alex

Hi all,

Is it Christmas, or are you discriminating against people using Windows? :wink: I had hoped to receive some help here. At least the first question should be very easy to answer, especially for those who build the official libraries…

Thanks.

Alex

Sorry, I don’t follow the posts here very often. Let me answer your questions:

  1. It will cover all the ARCHs greater than or equal to CC 3.5. You can find the complete arch list for all the CUDA specs we support here: https://github.com/pytorch/builder/blob/master/conda/pytorch-nightly/bld.bat#L20
  2. Sorry to keep you trying. Actually, I haven’t tried that before. I’ll get back to you when I have figured it out.

Hi, thank you very much.

  1. OK, I assume that “+PTX” means that a PTX version also gets embedded into the binaries, so that all future ARCHs (not known at compile time) are also supported, right? Good approach, IMHO.

  2. That would be great. Any info / hints / recommendations are much appreciated. I gave up after two days of struggling either with compiler errors or with not all functions being exported correctly…
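For anyone reading along, here is a minimal C++ sketch of the compatibility rules behind the “+PTX” question. It assumes the usual CUDA behaviour: native code (SASS) built for CC X.y runs on devices with CC X.z where z >= y, and embedded PTX can be JIT-compiled by the driver for any device whose CC is equal to or higher than the PTX version. All names (ComputeCapability, EmbeddedCode, how_kernels_load) are hypothetical helpers for illustration and are not part of libtorch or the CUDA API.

```cpp
#include <vector>

// Compute capability, e.g. {3, 5} for CC 3.5.
struct ComputeCapability {
    int major_ver;
    int minor_ver;
};

// What a binary embeds: native SASS for some archs and, optionally, PTX.
struct EmbeddedCode {
    std::vector<ComputeCapability> sass;
    std::vector<ComputeCapability> ptx;
};

enum class LoadPath { NativeSass, JitFromPtx, Unsupported };

// Decide how kernels could be loaded on a device with capability `dev`.
inline LoadPath how_kernels_load(const EmbeddedCode& bin, ComputeCapability dev) {
    // SASS is only compatible within the same major version,
    // and only with an equal or higher minor version.
    for (const ComputeCapability& cc : bin.sass) {
        if (cc.major_ver == dev.major_ver && cc.minor_ver <= dev.minor_ver) {
            return LoadPath::NativeSass;
        }
    }
    // PTX is forward compatible: any device at or above the PTX version can JIT it.
    for (const ComputeCapability& cc : bin.ptx) {
        const bool ptx_not_newer =
            cc.major_ver < dev.major_ver ||
            (cc.major_ver == dev.major_ver && cc.minor_ver <= dev.minor_ver);
        if (ptx_not_newer) {
            return LoadPath::JitFromPtx;
        }
    }
    return LoadPath::Unsupported;
}
```

With an illustrative (not official) build that embeds SASS for 3.5, 5.0, 6.0 and 7.0 plus PTX for 7.0, a CC 7.5 device would use the native 7.0 code, a CC 8.0 device would fall back to JIT from PTX, and a CC 3.0 device would not be supported.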

Alex.

I’ve tested the static build and it should be fixed. (However, there is currently a bug in torch/CMakeLists.txt#L268-269. It can be solved by swapping the two lines. PR https://github.com/pytorch/pytorch/pull/15989 is on this.) The only step is to run set EXTRA_CAFFE2_CMAKE_FLAGS= -DTORCH_STATIC=1 before the build.

Thanks. I realized, however, that I have more problems with building libtorch from source than I originally thought ;). I am following the guide for Windows here: https://github.com/pytorch/pytorch#from-source. “python setup.py install” runs just fine, and I get my libs and DLLs in the build directory. However:

  1. When I try to link my C++ app against torch.lib etc., I get a lot of unresolved symbols, as if exports were missing from the lib. I can, however, successfully link against the libs from the official libtorch binaries, so I was wondering how these libs are built.

  2. For some reason, the torch.dll that gets created does not depend on caffe2_gpu.dll despite compiling with CUDA and cuDNN (observed using Dependency Walker).

Due to the complexity of the PyTorch project and the lack of usable documentation on this topic, I do not seem to be able to track down my problems here. I tried to tweak CMakeLists.txt in various ways, such as option(BUILD_TORCH “Build Torch” ON) (it was OFF), but without success.

Any hints?

Alex

When I try to link my C++ app against torch.lib etc., I get a lot of unresolved symbols, as if exports were missing from the lib. I can, however, successfully link against the libs from the official libtorch binaries, so I was wondering how these libs are built.

Our nightly build scripts can be found here. If you are using the static libs, linking against torch.lib is not enough. Please link against all the libs under the lib directory except those for the tests. The official libtorch libs are shared libraries, so linking against torch only is acceptable there.

For some reason, the torch.dll that gets created does not depend on caffe2_gpu.dll despite compiling with CUDA and cuDNN (observed using Dependency Walker).

Yes, this is a bug, and I’m fixing it with this PR: “Fix the caffe2_gpu linkage with torch on Windows” (pytorch/pytorch pull request #16071).

Hi,

OK, I can confirm that the dependency on caffe2_gpu is restored, thanks for the fix. I am still unable to link against the libs built from source (and I am, of course, linking against all the libs, not just torch.lib), so for now I have given up and will stick to the official libs, which do the job for me. Just FYI, I get basic linkage errors like:

LNK2001 unresolved external symbol “__declspec(dllimport) public: virtual void __cdecl torch::nn::Module::eval(void)” (__imp_?eval@Module@nn@torch@@UEAAXXZ)

During the build I also saw (apart from hundreds of other marginal issues) many suspicious warnings about dllimport/dllexport mismatches and the like.

Thanks for your patience, help and contributions for Windows users ;).

Alex

Glad it’s working for you. As for the static build, I guess you’ll need to append -DTORCH_BUILD_STATIC_LIBS to the compiler flags when compiling your cpp extension that uses libtorch.
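To illustrate why such a define can matter, here is a minimal sketch of the common Windows export-macro pattern. When a class is declared with __declspec(dllimport), the consumer's object code references symbols prefixed with __imp_, which only exist in a DLL's import library, so linking that code against a static library leaves them unresolved. All names below (DEMO_API, DEMO_BUILD_STATIC_LIBS, DEMO_BUILDING_LIBRARY, demo::Module) are hypothetical and only mimic the idea; this is not a copy of the actual libtorch headers.

```cpp
#include <string>

// Normally passed on the consumer's compile line (e.g. /DDEMO_BUILD_STATIC_LIBS).
// It is defined here only so that the sketch is self-contained.
#define DEMO_BUILD_STATIC_LIBS 1

#if defined(_WIN32) && !defined(DEMO_BUILD_STATIC_LIBS)
#  if defined(DEMO_BUILDING_LIBRARY)
     // Compiling the DLL itself: export the symbols.
#    define DEMO_API __declspec(dllexport)
#    define DEMO_LINKAGE "dllexport"
#  else
     // Compiling a consumer of the DLL: references become __imp_ symbols.
#    define DEMO_API __declspec(dllimport)
#    define DEMO_LINKAGE "dllimport"
#  endif
#else
   // Static build (or non-Windows): no decoration, plain symbol names.
#  define DEMO_API
#  define DEMO_LINKAGE "none"
#endif

namespace demo {

// Stand-in for a library class such as a module with an eval() method.
class DEMO_API Module {
public:
    void eval() { training_ = false; }
    bool is_training() const { return training_; }

private:
    bool training_ = true;
};

// Reports which decoration this translation unit was compiled with.
inline std::string linkage() { return DEMO_LINKAGE; }

}  // namespace demo
```

If the static-build define is missing on the consumer side under MSVC, the same header would select the dllimport branch, which is one plausible way to end up with LNK2001 errors on __imp_ symbols.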

Hi Alex -
I am also attempting to use the libtorch prebuilt binaries in Windows for C++ programs.
I am getting hung up on some CMake issues. Which compiler did you use? gcc or msvc? Do you have any pointers on how to get this to build successfully?

I am using MSVC 2017 and (predictably) am having issues with pthreads.
Thanks,
Rob

Hi Rob,

I am using MSVC. Which CMake issues are you experiencing? When using the prebuilt binaries I didn’t have any problems at all (however, it is not completely clear which GPU archs are supported). When building from source, I had only two issues:

  1. I had to remove AT_CPP14_CONSTEXPR from this file; MSVC just doesn’t seem to accept it.

  2. I had to edit the torch_cuda_get_nvcc_gencode_flag macro in utils.cmake because I believe that there was something wrong with inferring the -gencode parameters from the TORCH_CUDA_ARCH_LIST variable. I observed some malformed -gencode output, so I just hardcoded the archs I need.
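As a rough illustration of what well-formed output should look like, here is a small C++ sketch that expands an arch list into -gencode flags. It assumes a semicolon-separated list such as "3.5;7.0+PTX" and the standard nvcc syntax, where code=sm_XX embeds native code and code=compute_XX embeds PTX. The function gencode_flags is a hypothetical helper and not the CMake macro from utils.cmake.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Expand an arch list like "3.5;7.0+PTX" into nvcc -gencode flags.
inline std::vector<std::string> gencode_flags(const std::string& arch_list) {
    std::vector<std::string> flags;
    const std::string ptx_suffix = "+PTX";
    std::istringstream in(arch_list);
    std::string entry;

    while (std::getline(in, entry, ';')) {
        if (entry.empty()) {
            continue;  // tolerate stray separators
        }

        // Detect and strip a trailing "+PTX".
        bool want_ptx = false;
        if (entry.size() > ptx_suffix.size() &&
            entry.compare(entry.size() - ptx_suffix.size(),
                          ptx_suffix.size(), ptx_suffix) == 0) {
            want_ptx = true;
            entry.erase(entry.size() - ptx_suffix.size());
        }

        // "3.5" -> "35"
        std::string digits;
        for (char c : entry) {
            if (c != '.') {
                digits += c;
            }
        }

        // Native code for this arch.
        flags.push_back("-gencode arch=compute_" + digits + ",code=sm_" + digits);
        // Optional PTX, which newer GPUs can JIT-compile.
        if (want_ptx) {
            flags.push_back("-gencode arch=compute_" + digits +
                            ",code=compute_" + digits);
        }
    }
    return flags;
}
```

For "3.5;7.0+PTX" this yields three flags: native code for sm_35 and sm_70, plus PTX for compute_70. Comparing the flags in the build log against this shape is a quick way to spot malformed -gencode output.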

The rest went OK. I also highly recommend using the Ninja build system; it will be a lot faster.

Alex

I am also getting the same kind of LNK2001 unresolved external symbol “__declspec(dllimport) ...” error. How did you resolve it?