Working recipe for compiling LibTorch from source

I wonder if there are any tested recipes for compiling LibTorch from source that produce exactly the same ready-to-use package as the one linked on the main page here?

The instructions mentioned here don’t really work well. For example, with the image nvidia/cuda:11.7.1-cudnn8-devel-ubuntu20.04, while trying to build a release branch:

WORKDIR /workspace
RUN git clone -b v2.0.0 --recurse-submodule https://github.com/pytorch/pytorch.git
WORKDIR /workspace/pytorch/build
ARG TORCH_CUDA_ARCH_LIST="Ampere"
RUN cmake \
    -DBUILD_SHARED_LIBS:BOOL=ON \
    -DCMAKE_BUILD_TYPE:STRING=Release \
    -DPYTHON_EXECUTABLE:PATH=`which python3` \
    -DBUILD_PYTHON:BOOL=OFF \
    -DCMAKE_INSTALL_PREFIX:PATH=/workspace/libtorch .. && \
    cmake --build . --target install --config Release -- -j$(nproc)

the installation does not produce all of the shared libraries; for example, libnvfuser_codegen.so is missing.
There are a few related and still unanswered questions: this and this.
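
To pin down exactly which libraries are missing, one option is to compare the lib directory of the official package against the one produced by the local build. The following is only a minimal sketch: `missing_libs` is a hypothetical helper and both paths in the usage comment are assumptions about where the two packages were unpacked.

```shell
# Print every shared library that exists in the reference package's lib
# directory but has no same-named file in the locally built one.
# Hypothetical helper; arguments: <reference lib dir> <built lib dir>.
missing_libs() {
    ref_dir=$1
    built_dir=$2
    for lib in "$ref_dir"/*.so*; do
        [ -e "$lib" ] || continue          # glob matched nothing
        name=$(basename "$lib")
        [ -e "$built_dir/$name" ] || echo "$name"
    done
}

# Example call (paths are assumptions, adjust to your layout):
#   missing_libs /opt/libtorch-official/lib /workspace/libtorch/lib
```

The comparison is by file name only, so bundled third-party libraries whose names differ between the two packages will also show up. Treat the output as a starting point rather than a definitive list.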

Thanks in advance!

The pytorch/builder repository contains the scripts we use to build these binaries so you could take a look at it.


Thanks! Indeed, as I suspected, the build logic copies everything that was compiled.
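
For anyone wanting to reproduce that step by hand, here is a rough sketch of the "copy everything that was compiled" idea. It is not the actual pytorch/builder script: `copy_built_libs` is a hypothetical helper, and the directory layout in the usage comment is an assumption based on the Dockerfile above.

```shell
# Copy every shared library from a build tree's lib directory into the
# package's lib directory, so libraries that the install target skipped
# still end up in the package.
# Hypothetical helper; arguments: <build lib dir> <package lib dir>.
copy_built_libs() {
    build_lib_dir=$1
    package_lib_dir=$2
    mkdir -p "$package_lib_dir"
    for lib in "$build_lib_dir"/*.so*; do
        [ -e "$lib" ] || continue          # glob matched nothing
        cp -P "$lib" "$package_lib_dir"/   # -P keeps symlinks as symlinks
    done
}

# Example call (paths are assumptions, adjust to your layout):
#   copy_built_libs /workspace/pytorch/build/lib /workspace/libtorch/lib
```

Only files matching `*.so*` are copied; headers, CMake config files, and static archives are left to the regular install step.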

Maybe it's worth mentioning this in the PyTorch repo docs so that others aren't confused in the future?

This might be a good idea! Would you be interested in updating the docs?

Well, before doing so I’d need to find out why the Docker cmake command mentioned above produces a LibTorch build that, on a simple toy example, fails with:

what():  Type c10::intrusive_ptr<LinearPackedParamsBase> could not be converted to any of the known types.
Exception raised from operator() at /workspace/pytorch/aten/src/ATen/core/jit_type.h:1793 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x60 (0x7fca98ab29a0 in /app/libs/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11:

This one looks similar, but it has already been fixed…