PyTorch 2 and the C++ interface

Does the PyTorch 2 transition of some parts from C++ back to Python mean that in future releases there will be no C++ frontend for PyTorch?

This post might be relevant.

Thanks, but I'm still not sure what will happen to the C++ API of PyTorch in the 2.x series!

If I understand correctly, designing models in C++, Java, Rust, or any language using a binding to the C++ API is at a dead end, since such models will never benefit from the new compilation features introduced in PyTorch 2.

I hope that, using the C++ API, we will still be able:

  • to import models designed and compiled in Python,
  • to add input or output modules to compiled modules,
  • to train the result,
  • to infer.

@ptrblck, @Chillee, can you confirm? Any news about importing compiled modules and the ability to integrate them into C++ models?

I am very interested in the future of the C++ API too.

There are a number of different use cases for the C++ frontend, which are worth stepping through individually, since PT2 has different implications for each of them.

Write PyTorch-style code in C++. These users of the C++ API liked PyTorch’s Python API and want to code their models directly the same way they did in Python, but using torch::empty(), C++ NN Modules, etc. in C++, for lower overhead or to avoid the GIL. There is no way for models written in this way to directly use Dynamo, since Dynamo is entirely predicated on Python bytecode analysis, and we have no plans to actually solve this. Additionally, in the limit, PT2 is supposed to remove all of the Python-side overhead that might have originally induced you to port your code to C++, so if you don’t have requirements for Python-less deployment (more on this below), we would hope that the next models you write can be done back in Python.
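For concreteness, this is the style of usage in question: a minimal sketch of a model written directly against the stable LibTorch torch::nn API (the layer sizes and shapes here are arbitrary):

```cpp
#include <torch/torch.h>
#include <iostream>

// A small MLP written directly in C++ against the torch::nn API,
// i.e. the "write PyTorch-style code in C++" use case.
struct Net : torch::nn::Module {
  Net()
      : fc1(register_module("fc1", torch::nn::Linear(784, 128))),
        fc2(register_module("fc2", torch::nn::Linear(128, 10))) {}

  torch::Tensor forward(torch::Tensor x) {
    return fc2->forward(torch::relu(fc1->forward(x)));
  }

  torch::nn::Linear fc1{nullptr}, fc2{nullptr};
};

int main() {
  Net net;
  auto out = net.forward(torch::randn({64, 784}));  // eager execution, no Python
  std::cout << out.sizes() << '\n';
}
```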

That being said, it is still possible to make use of PT2 as a tool. You have a few pathways for doing this:

  • You have identified a region of your graph which can profitably be compiled end-to-end with Inductor. You can capture these operators in Python and then have Inductor export the fused kernel ahead of time, to be invoked from C++ (a hypothetical sketch of that call site follows this list). This does not exist today but is on our roadmap for this half.
  • You could use Lazy Tensor to capture all of the operations and then hand them to our compiler stack. The compiler stack is still in Python, but at runtime, in principle, Python can be excluded from the hot path. You would run into some trouble if you needed dynamic shapes, but C++ code can be manually rewritten to symbolically trace integers if necessary.
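To make the first pathway concrete, here is a purely hypothetical sketch of what the C++ call site might look like. Since the ahead-of-time export does not exist yet, run_fused_block and its signature are inventions for illustration, stubbed out with the eager ops such a fused kernel might replace:

```cpp
#include <torch/torch.h>
#include <iostream>

// HYPOTHETICAL: stands in for a fused kernel that Inductor would export
// ahead of time (e.g. as a shared library). The name and signature are
// assumptions, not a real API; the body is just the eager ops the
// compiled kernel would replace.
torch::Tensor run_fused_block(const torch::Tensor& x) {
  return torch::relu(torch::matmul(x, x.transpose(0, 1)));
}

int main() {
  // The surrounding model stays ordinary eager C++; only the profitable
  // region is routed through the (eventually) compiled kernel.
  auto out = run_fused_block(torch::randn({32, 1024}));
  std::cout << out.sizes() << '\n';  // [32, 32]
}
```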

C++ API as a deployment mechanism. We fully intend to support this via the “export” workflow. In export, we trace an entire model written in Python and produce a serialized artifact, which can be loaded by a C++ runtime and executed. The exported model may or may not have had optimizations applied to it; this is up in the air. Our current work is on serialization for mobile devices, where Inductor-style compilation doesn’t make sense, but in this half we are also working on server-side export. You should be able to chain the resulting models with other modules.
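Until the new export workflow lands, the existing TorchScript path illustrates the intended shape of this use case: script or trace the model in Python (e.g. torch.jit.script(model).save("model.pt")), then load and run it from C++. A minimal sketch of today’s mechanism (the file name and input shape are assumptions):

```cpp
#include <torch/script.h>
#include <iostream>

int main() {
  // Load a model that was scripted/traced and saved from Python.
  torch::jit::Module module = torch::jit::load("model.pt");
  module.eval();

  // Inputs are passed as IValues; the shape here assumes an image model.
  std::vector<torch::jit::IValue> inputs;
  inputs.push_back(torch::randn({1, 3, 224, 224}));

  torch::Tensor output = module.forward(inputs).toTensor();
  std::cout << output.sizes() << '\n';
}
```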

Hope that helps.

Thank you for this detailed answer.
Between your two use cases there is a third one: training in C++/Java/Rust/… a model that has been, at least in its major parts (the backbone), written and optimized in Python.
Will that be supported too?
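For what it’s worth, the TorchScript path already allows something close to this today: load a Python-built backbone in C++, attach a new head, and train both. A hedged sketch (the file name, feature size, and shapes are assumptions):

```cpp
#include <torch/script.h>
#include <torch/torch.h>
#include <iostream>

int main() {
  // Backbone written, optimized, and saved in Python (assumed file name).
  torch::jit::Module backbone = torch::jit::load("backbone.pt");
  backbone.train();

  // New head defined directly in C++; assumes the backbone emits 512 features.
  torch::nn::Linear head(512, 10);

  // Gather parameters from both parts into a single optimizer.
  std::vector<torch::Tensor> params;
  for (const auto& p : backbone.parameters()) params.push_back(p);
  for (const auto& p : head->parameters()) params.push_back(p);
  torch::optim::SGD optimizer(params, /*lr=*/0.01);

  // One illustrative training step on random data.
  auto input = torch::randn({16, 3, 224, 224});
  auto target = torch::randint(0, 10, {16}, torch::kLong);
  auto features = backbone.forward({input}).toTensor();
  auto loss = torch::nn::functional::cross_entropy(head->forward(features), target);

  optimizer.zero_grad();
  loss.backward();
  optimizer.step();
  std::cout << "loss: " << loss.item<float>() << '\n';
}
```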

Thanks for the answer as well! Follow-up question:
Will the at::/torch:: interface still exist in C++, so that I can keep working on tensors and their ops?
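For context, I mean plain tensor-level work like this, through today’s stable torch::/at:: C++ surface:

```cpp
#include <torch/torch.h>
#include <iostream>

int main() {
  // Direct tensor work through the at::/torch:: C++ API.
  torch::Tensor a = torch::randn({3, 4});
  torch::Tensor b = torch::randn({4, 5});
  torch::Tensor c = torch::relu(torch::matmul(a, b)).sum();
  std::cout << c.item<float>() << '\n';
}
```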
