Support loading and executing an ExportedProgram from torch.export with forward + backward + optimizer step in a C++ environment

Hi all, we are currently working on an online ML platform at our company, which requires us to:

  1. Export a PyTorch model graph and its variables into an IR that can be executed in a C++ environment.
  2. Execute the graph's forward pass, backward pass, and optimizer step to update the variables.
  3. Retrieve the updated variables from the training process and send them to a separate inference service, which executes the forward-pass graph with the updated variables.
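To make the three requirements concrete, here is an eager-mode sketch of the training step we would like to capture as a single exportable graph (the model and hyperparameters are made up for illustration; today this loop only runs in Python):

```python
import torch
import torch.nn as nn

# Eager-mode version of the step we want as one exported graph:
# forward, backward, and optimizer update, all over the same variables.
model = nn.Linear(4, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(8, 4)
y = torch.randn(8, 2)

loss = nn.functional.mse_loss(model(x), y)
opt.zero_grad()
loss.backward()
opt.step()

# Requirement 3: the updated variables we would hand off to a separate
# inference service after each training step.
updated_state = {k: v.detach().clone() for k, v in model.state_dict().items()}
```

In our setting this whole step would need to execute in C++, with `updated_state` shipped to the inference service.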

After searching the docs and code, torch.export seems to be the closest way to achieve this, but there are some gaps; I'm not sure if I missed anything:

  1. torch.export only exports the forward pass; it cannot export forward + backward + optimizer step into the same graph. The backward graph is executed eagerly after the program is loaded back into a Python environment (checked against the latest PyTorch 2.5 docs here). Also, AOT autograd seems to be exporting
  2. To run the graph in C++, we can only compile it with AOTInductor into a .so file; there is no C++ API to load the exported graph itself and programmatically call it.
  3. There is no way to call this compute graph while passing variable updates into it, i.e., feeding updated variables back in as graph inputs.
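Gaps 1 and 3 together amount to wanting the training step as a pure function: variables in, updated variables out. Here is a hedged eager-mode sketch of that functional form (names and the SGD update are our own illustration, not an existing torch.export feature); capturing this whole function as one exported graph is exactly what is missing today, since `torch.autograd.grad` below runs eagerly and is not part of the exported IR:

```python
import torch
import torch.nn.functional as F

# Functional SGD training step: parameters are explicit inputs, updated
# parameters are explicit outputs. This is the graph shape we would like
# torch.export to produce; today only the forward math is exportable.
def train_step(weight, bias, x, y, lr=0.1):
    weight = weight.detach().requires_grad_(True)
    bias = bias.detach().requires_grad_(True)
    loss = F.mse_loss(x @ weight.t() + bias, y)
    grad_w, grad_b = torch.autograd.grad(loss, (weight, bias))
    # Optimizer step expressed as plain tensor math on the inputs.
    return weight.detach() - lr * grad_w, bias.detach() - lr * grad_b

w, b = torch.randn(2, 4), torch.randn(2)
x, y = torch.randn(8, 4), torch.randn(8, 2)
new_w, new_b = train_step(w, b, x, y)
```

A C++ runtime that could load and call such a graph, returning `new_w` and `new_b` to feed back in on the next step, would cover our use case.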

Are there any plans to extend torch.export to support this functionality?