I am interested in building the libtorch library statically. To do this, I build the repo with its setup script, running python setup.py build with the following environment variables:
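The exact list is long, so here is just its general shape; the concrete flags below are illustrative placeholders rather than my precise configuration:

```bash
# Illustrative only: setup.py forwards BUILD_*/USE_*/CMAKE_* environment
# variables to CMake, so the build is driven roughly like this.
export BUILD_SHARED_LIBS=OFF     # request static archives instead of .so files
export CMAKE_BUILD_TYPE=Release  # example value
python setup.py build
```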
However, after a while I get many linker errors of this type:
LayoutManager.cpp:(.text+0x1690): multiple definition of `torch::nativert::LayoutManager::deallocate_and_plan()'; test_nativert/CMakeFiles/test_nativert.dir/__/__/__/torch/nativert/executor/memory/LayoutManager.cpp.o:LayoutManager.cpp:(.text+0x2770): first defined here
collect2: error: ld returned 1 exit status
Many different functions seem to be defined in several places. I have not added any code to the pytorch repo, so I am quite perplexed as to why this happens. In general, building a static libtorch library is turning out to be a real pain. If anyone has suggestions on how to do it, they would be greatly appreciated!
I have already tried that. When you do, you get a libtorch.so library. I have seen somewhere that if you replace “shared” with “static” in the download link you get the static variant. Unfortunately, that is not the case: libtorch.so is still there, although all the other libraries are static. What I want to achieve is a libtorch.a static library. I managed to compile it from source by setting the following environment variables:
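Roughly speaking, the configuration looked along these lines (the flags below are a sketch with placeholder values, not the verbatim list):

```bash
# Sketch of a static-build configuration (assumed flags, not the exact list):
export BUILD_SHARED_LIBS=OFF   # produce static archives, e.g. libtorch.a
export BUILD_TEST=0            # leave out the test binaries
export USE_CUDA=0              # example: CPU-only build
export USE_DISTRIBUTED=0       # example: trim optional components
```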
After that I ran python setup.py build and managed to build a libtorch.a. However, I am still having lots of issues: many of the operators are missing, and when I try to load a model I trained, I get an error like this:
terminate called after throwing an instance of 'torch::jit::ErrorReport'
what():
Unknown builtin op: aten::mul.
Could not find any similar ops to aten::mul. This op may not exist or may not be currently supported in TorchScript.
:
File "<string>", line 3
def mul(a : float, b : Tensor) -> Tensor:
return b * a
~~~~~ <--- HERE
def add(a : float, b : Tensor) -> Tensor:
return b + a
'mul' is being compiled since it was called from 'gelu_0_9'
File "<string>", line 3
def gelu_0_9(self: Tensor) -> Tensor:
return torch.gelu(self, approximate='none')
~~~~~~ <--- HERE
It would be amazing to have more documentation and support for static binaries. In our case we are trying to deploy our models on-prem, and it is much better to statically link the libraries.
Thank you for your help! I was not aware that ExecuTorch is now the go-to method for deploying models; most of the documentation and tutorials talk about TorchScript. I will take a look at ExecuTorch and hopefully it will solve my problems.