Hello All,
This is my first post, and I’ve read the community guidelines before posting.
I have been using PyTorch and libtorch for quite a while now, so I'd say I am reasonably familiar with both. I have developed some models in libtorch, and I am trying to JIT-trace them into TorchScript to take advantage of all the optimizations that come with it. I need to do both the tracing and the graph execution on the C++ side, because not all users of my library have access to a Python interpreter.
I have had some success so far: the traced function executes and produces correct results as long as the input shape does not change (i.e., matches the example input used for tracing). However, I run into various issues when generalizing to other input shapes. All of them seem to be related to sizes: the generated graph appears to have baked the tensor shapes in as fixed, static values.
Even though my models are complicated, I managed to boil the problem down to the following MRE. The Python version works just fine, i.e., the traced function generalizes to different input tensor shapes:
import torch
def f(x):
    return torch.sum(x.reshape(x.shape[0], 3, 2), -1)

m = torch.jit.trace(f, torch.rand(1, 6))

y = m(torch.rand(1, 6))
print(y.shape)  # torch.Size([1, 3])

y = m(torch.rand(2, 6))
print(y.shape)  # torch.Size([2, 3])
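As a sanity check on the Python side (a quick sketch, assuming a stock PyTorch build), dumping the traced graph should show the batch dimension being read at runtime via an aten::size op rather than stored as a constant, which would explain why the Python version generalizes:

```python
import torch

def f(x):
    return torch.sum(x.reshape(x.shape[0], 3, 2), -1)

m = torch.jit.trace(f, torch.rand(1, 6))
# If the tracer recorded the shape dynamically, the reshape sizes are
# built from aten::size + prim::ListConstruct instead of a constant
# [1, 3, 2] baked into the graph.
print(m.graph)
```

If the C++ tracer were recording the same thing, I would expect the same size ops in its graph instead of a constant shape.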
On the other hand, its C++ equivalent does not:
#include <iostream>

#include <torch/torch.h>
#include <torch/csrc/jit/frontend/tracer.h>
#include <torch/csrc/jit/runtime/graph_executor.h>

int main() {
  auto var_name_lookup_fn = [](const torch::autograd::Variable& /*var*/) -> std::string {
    return "";
  };

  // Same function as the Python version: reshape to (batch, 3, 2), sum over the last dim.
  auto f = [](torch::jit::Stack inputs) -> torch::jit::Stack {
    auto x = inputs[0].toTensor();
    return {torch::sum(x.reshape({x.size(0), 3, 2}), -1)};
  };

  auto [state, out] = torch::jit::tracer::trace(torch::jit::Stack{torch::rand({1, 6})},
                                                f,
                                                var_name_lookup_fn,
                                                /*strict=*/true,
                                                /*force_outplace=*/false);

  auto executor = torch::jit::GraphExecutor(
      state->graph, "model.value", torch::jit::ExecutorExecutionMode::PROFILING);

  // Works: same shape as the example input used for tracing.
  auto stack1 = torch::jit::Stack{torch::rand({1, 6})};
  executor.run(stack1);
  std::cout << stack1[0].toTensor() << std::endl;

  // Fails: different batch size.
  auto stack2 = torch::jit::Stack{torch::rand({2, 6})};
  executor.run(stack2);
  std::cout << stack2[0].toTensor() << std::endl;

  return 0;
}
The interpreter is able to execute the graph with shape (1, 6), which is the shape I used to trace the function, but it fails with the following error when the input tensor has shape (2, 6):
The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
RuntimeError: shape '[1, 3, 2]' is invalid for input of size 12
Any help or pointers would be appreciated!