Compiled libtorch statically but getting linking errors when trying to use it (CentOS)

Hi,
I’ve previously successfully compiled PyTorch 1.11 statically, and used libtorch from the resulting libraries to run PyTorch models from a C++ project.
Now I've tried updating to PyTorch 1.13, compiled the exact same way (the whole PyTorch build runs successfully), but I'm suddenly getting linking errors when I link against libtorch in my C++ project:

/home/david/NeuralNets_software/pytorch/pytorch113_cuda112_static/lib/libtorch_cuda.a(Conv.cpp.o): In function `c10::SmallVector<long, (2)+(2)> MakeConvOutputShape<2>(int, int, std::array<long, 2> const&, std::vector<long, std::allocator<long> > const&, c10::List<long> const&, c10::List<long> const&, c10::List<long> const&)':
Conv.cpp:(.text+0x1089): multiple definition of `c10::SmallVector<long, (2)+(2)> MakeConvOutputShape<2>(int, int, std::array<long, 2> const&, std::vector<long, std::allocator<long> > const&, c10::List<long> const&, c10::List<long> const&, c10::List<long> const&)'
/home/david/NeuralNets_software/pytorch/pytorch113_cuda112_static/lib/libtorch_cpu.a(qconv.cpp.o):qconv.cpp:(.text+0x1553): first defined here

This is all compiled with GCC 7.3.1 (devtoolset-7) on CentOS.
The linking command in the CMake file is the following:
target_link_libraries(MyProject PRIVATE
    -L/home/david/NeuralNets_software/pytorch/pytorch113_cuda112_static/lib
    -L/home/david/NeuralNets_software/pytorch/pytorch113_cuda112_static/lib64
    -L/usr/local/cuda-11.2/lib64
    -L/usr/local/cuda-11.2/extras/CUPTI/lib64
    -Wl,--whole-archive torch torch_cpu torch_cuda c10_cuda onnx magma -Wl,--no-whole-archive
    -lc10
    -Wl,--start-group -lmkl_intel_lp64 -lmkl_gnu_thread -lmkl_core -Wl,--end-group
    -lmagma_sparse -lnnpack -lqnnpack -lpytorch_qnnpack -lXNNPACK -lpthreadpool
    -lcpuinfo -lcpuinfo_internals -lclog -lprotobuf -lprotoc -lonnx_proto -lfbgemm
    -lfoxi_loader -lcaffe2_protos -ldnnl -lpthread -ldl -lrt -l:libopenblas.a -lgomp
    -lsleef -lasmjit -lm
    -lcudart -lcublas -lcufft -lcurand -lcusparse -lcusolver -lcudnn
    -l:libcupti_static.a -lfmt -lnvrtc -lnvToolsExt
    -Wl,-rpath,$ORIGIN --verbose)

Any idea why this is happening? Am I doing something wrong, or is it a bug in PyTorch 1.13 when compiling it statically?

Cheers,
David

Is this something you could help shed some light on, @ptrblck?
Why would the definitions clash between “libtorch_cuda” and “libtorch_cpu” all of a sudden?

Thanks in advance,
David

I don’t know as I haven’t tried this build before, but @malfet might have an idea.

Hi @malfet, do you have any idea why I'm facing the error above when trying to link against my static PyTorch 1.13 build? It worked well with PyTorch 1.11 (and PyTorch 1.8 before that).
This problem is currently blocking production for me, and I've been stuck for two weeks, so any help or insight would be much appreciated.

Best regards,
David

Hi,
I hit the same error with version 2.0.1.
I found this commit, which says it solved the issue:

But the fix is only in release 2.1.

That’s great, thanks a lot @mpalya !
I managed to get it to build by adding `inline` to those template lines, which also seems to work. But it's really good to know there's a proper fix, and I might switch to that solution instead since it's more correct. :slight_smile:
Cheers, David