Is it possible to compile a PyTorch model for GPU (for inference) and save the compiled model, so it can be loaded later without paying the JIT recompilation cost?
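For reference, this is the just-in-time flow I'd like to avoid repeating on every process start (a minimal sketch with a toy model; as far as I can tell, `torch.compile` redoes this work in each new process):

```python
# Minimal sketch of the status quo: torch.compile JIT-compiles on the
# first call, and that work is redone in every new process.
# The toy model here is made up for illustration.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.ReLU()).to(device).eval()
compiled = torch.compile(model)

x = torch.randn(2, 8, device=device)
with torch.no_grad():
    out = compiled(x)  # first call pays the compilation cost
```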
Yes! You'll want to check out AOT Inductor if you're deploying in C++: https://github.com/pytorch/pytorch/tree/main/test/cpp/aot_inductor
If you're still deploying with Python, something like this should work: [draft] fsspec code cache by msaroufim · Pull Request #106501 · pytorch/pytorch · GitHub
Thank you! I found the `aot_compile` and `compile_fx_aot` functions in the PyTorch Python code (rough usage sketch after this list):

- https://github.com/pytorch/pytorch/blob/main/test/cpp/aot_inductor/test.py - the AOT Inductor test
- https://github.com/pytorch/pytorch/blob/main/torch/_export/__init__.py#L414 - `aot_compile`: traces either an nn.Module's forward function or just a callable with PyTorch operations inside, generates executable C++ code from the program, and returns the path to the generated shared library
- https://github.com/pytorch/pytorch/blob/main/torch/_export/__init__.py#L168 - `export`: returns an ExportedProgram
- https://github.com/pytorch/pytorch/blob/main/torch/_inductor/compile_fx.py#L822 - `compile_fx_aot`: returns the path to the generated .so
- https://github.com/pytorch/pytorch/blob/main/torch/_inductor/__init__.py#L30 - `aot_compile`: ahead-of-time compiles a given FX graph with TorchInductor into a shared library
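Based on the `aot_compile` description above, a minimal usage sketch might look like this (the API is experimental, so the module path and exact signature may change; the toy model and inputs are made up for illustration):

```python
# Rough sketch of torch._export.aot_compile as described above; this API
# is experimental and may change. Model and inputs are illustrative only.
import torch
import torch._export

class MyModel(torch.nn.Module):
    def forward(self, x):
        return torch.relu(x @ x)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = MyModel().to(device).eval()
example_inputs = (torch.randn(8, 8, device=device),)

# Traces the forward, compiles it ahead of time with Inductor, and returns
# the path to the generated shared library (.so).
so_path = torch._export.aot_compile(model, example_inputs)
print(so_path)
```

The returned shared library is what you'd load later (e.g. from C++, as in the AOT Inductor test above) instead of re-tracing and recompiling.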
Yeah, this is all very hot off the press; it will probably get consolidated by the next major release.