I am trying to understand the differences between the various ways to compile/export a PyTorch model.
Background: My end goal is to export my detectron2-trained model as a TensorRT .engine file and use it in NVIDIA frameworks. That got me reading about TorchScript, torch.fx, torch.export, torch.compile, and TorchDynamo with its various backends (e.g. TorchInductor and torch_tensorrt).
Since the model (Mask2Former with a Swin Transformer backbone) and the detectron2 framework in general involve complex code constructs and dynamic control flow, I’ve ruled out torch.fx and all tracing-based methods (please correct me if my thinking is wrong).
I’m now left with these questions:
- Since my code has conditionals and data-dependent control flow, is torch.jit.script the only “easy” option, given that graph breaks fall back to the Python interpreter and I want to run the model outside a Python runtime?
- Is torch.compile applied directly to the PyTorch model suitable for my goal (eventually serializing to a TensorRT engine file and using it outside Python), or should I first convert the model to TorchScript?
- After compiling the model with any of the above methods, my understanding is that I still need torch_tensorrt to serialize the model. Is there another way? I’ve stumbled upon NVIDIA’s torch2trt project, but I’m not certain whether it’s a better option.
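For concreteness on the first question, here is a toy snippet (not my real model, just a minimal sketch) of the kind of data-dependent control flow I mean, and why I’m leaning towards scripting over tracing:

```python
import torch

# Tracing (torch.jit.trace) would bake in whichever branch the example
# input happened to take; torch.jit.script keeps both branches in the graph.
@torch.jit.script
def pick_branch(x: torch.Tensor) -> torch.Tensor:
    if x.sum() > 0:  # data-dependent condition, decided at runtime
        return x * 2
    return x - 1

print(pick_branch(torch.ones(3)))   # takes the first branch
print(pick_branch(-torch.ones(3)))  # takes the second branch
```

My understanding is that the resulting ScriptFunction can then be saved with `torch.jit.save` and loaded outside Python (e.g. from libtorch in C++), which is why scripting looks like the “easy” path to me.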
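And for the second question, this is roughly how I imagine using torch.compile directly on the model. It’s only a sketch: I use the `"eager"` backend here so the snippet runs anywhere, but in my real pipeline the backend would be a TensorRT-capable one:

```python
import torch

class TinyModel(torch.nn.Module):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if x.sum() > 0:  # data-dependent control flow -> graph break
            return torch.relu(x)
        return -x

model = TinyModel()
# TorchDynamo captures graphs and falls back to Python at graph breaks,
# which is what makes me unsure it can produce a standalone artifact.
compiled = torch.compile(model, backend="eager")
out = compiled(torch.randn(4))
```

My worry is that since torch.compile is a JIT that keeps the Python runtime in the loop at graph breaks, it may not get me a serialized engine by itself.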
Sorry for the long post and appreciate any help!