torch::jit::script::Module vs torch::nn::Module performance

This tutorial describes how you can train a torch.nn.Module in Python, convert it to TorchScript with torch.jit.trace(), and save it to disk. You can then load this module in C++ and use it for inference.
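For concreteness, the C++ side of that workflow looks roughly like the sketch below. The file name and input shape are placeholder assumptions, not taken from the tutorial:

```cpp
#include <torch/script.h>

#include <iostream>
#include <vector>

int main() {
  // Load the module produced in Python by torch.jit.trace() + save().
  // "traced_model.pt" and the input shape are placeholder assumptions.
  torch::jit::script::Module module = torch::jit::load("traced_model.pt");
  module.eval();

  // Run a single forward pass on a dummy input, without autograd overhead.
  torch::NoGradGuard no_grad;
  std::vector<torch::jit::IValue> inputs;
  inputs.emplace_back(torch::randn({1, 3, 224, 224}));
  at::Tensor output = module.forward(inputs).toTensor();
  std::cout << output.sizes() << '\n';
  return 0;
}
```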

Compared to re-implementing the same Python neural network architecture in C++ as a torch::nn::Module, this is quite convenient. However, I cannot help but wonder whether this convenience comes at a cost. At a minimum, it seems to me that reconstructing the network architecture at C++ runtime must entail gluing components together via vtable lookups (or equivalent runtime indirection), something an equivalent hand-crafted torch::nn::Module compiled by a modern compiler could potentially avoid. Another possibility is that the compiled version has more streamlined memory access patterns than its JIT counterpart.
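To make the comparison concrete, the hand-written alternative I have in mind looks something like the following sketch. The two-layer architecture is just a stand-in, not any particular model:

```cpp
#include <torch/torch.h>

// A hand-written libtorch module; the layer sizes are arbitrary
// stand-ins for whatever the Python model actually contains.
struct NetImpl : torch::nn::Module {
  NetImpl() {
    fc1 = register_module("fc1", torch::nn::Linear(784, 128));
    fc2 = register_module("fc2", torch::nn::Linear(128, 10));
  }

  torch::Tensor forward(torch::Tensor x) {
    // The call graph is fixed at compile time, so the compiler can see
    // (and in principle inline) the whole forward pass.
    x = torch::relu(fc1->forward(x));
    return fc2->forward(x);
  }

  torch::nn::Linear fc1{nullptr}, fc2{nullptr};
};
TORCH_MODULE(Net);
```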

It may very well be that whatever costs exist are dwarfed by other factors in real-world applications. Still, I am curious whether any noteworthy tradeoffs exist and, if so, what their nature is.
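If nobody has measured this, a rough way to find out would be a micro-benchmark along these lines, run once per variant. The file name, input shape, and iteration count are placeholders:

```cpp
#include <torch/script.h>

#include <chrono>
#include <iostream>
#include <vector>

// Hypothetical micro-benchmark: times repeated forward passes of a
// loaded TorchScript module. The same timing loop could wrap a
// hand-written torch::nn::Module for comparison.
int main() {
  torch::jit::script::Module module = torch::jit::load("traced_model.pt");
  module.eval();
  torch::NoGradGuard no_grad;

  std::vector<torch::jit::IValue> inputs;
  inputs.emplace_back(torch::randn({1, 3, 224, 224}));

  // Warm up so one-time allocator and kernel-selection costs
  // don't skew the measurement.
  for (int i = 0; i < 10; ++i) {
    module.forward(inputs);
  }

  constexpr int kIters = 1000;
  auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < kIters; ++i) {
    module.forward(inputs);
  }
  auto end = std::chrono::steady_clock::now();

  auto us =
      std::chrono::duration_cast<std::chrono::microseconds>(end - start)
          .count();
  std::cout << "mean forward latency: " << us / kIters << " us\n";
  return 0;
}
```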