Decomposition to ATen IR

Hey @Jerome_Ku. The short answer is:

We have a handful of operator decompositions that always run by default inside the dispatcher (in C++), before the op ever makes it into __torch_dispatch__.

The linear/matmul decomposition is by far the most common one you'll run into.
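
One way to see this from Python is with a TorchDispatchMode that prints every op it intercepts: calling torch.nn.functional.linear under the mode shows the more primitive ops rather than aten.linear. This is just a minimal sketch; the exact ops you see (e.g. aten.t + aten.addmm vs. aten.mm + aten.add) can vary with input shape and PyTorch version.

import torch
from torch.utils._python_dispatch import TorchDispatchMode

class PrintOps(TorchDispatchMode):
    # Log every ATen op that actually reaches __torch_dispatch__.
    def __torch_dispatch__(self, func, types, args=(), kwargs=None):
        print(func)
        return func(*args, **(kwargs or {}))

x, w, b = torch.randn(2, 4), torch.randn(3, 4), torch.randn(3)
with PrintOps():
    torch.nn.functional.linear(x, w, b)
# prints something like aten.t.default followed by aten.addmm.default,
# but no aten.linear.default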

If you’re wondering why, the historical answer is that there are a number of ops that we don’t have dedicated derivative formulas for (e.g. linear), and so rather than writing a brand new formula, we just have the autograd engine decompose the op into more primitive ops that it does have formulas for (e.g. aten.mm and transpose).

If you are interested in export and only care about inference, we actually recently made it so that exporting for inference can preserve all ATen ops, including special ops like linear, in the graph:

import torch

# example sizes and inputs; substitute your own module and args
m = torch.nn.Linear(4, 4)
args = (torch.randn(2, 4),)
graph_module = torch.export.export(m, args).run_decompositions().module()
# you should see aten.linear, as long as you didn't manually specify that you want it decomposed
print(graph_module)
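
If you'd rather check programmatically than eyeball the printed graph, you can scan the exported FX graph for call_function targets. A minimal sketch (the exact overload name, e.g. aten.linear.default, may differ across versions):

# collect the ops that actually ended up in the exported graph
targets = {str(n.target) for n in graph_module.graph.nodes if n.op == "call_function"}
print(targets)  # expect to find 'aten.linear.default' here, rather than mm/addmm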