Is Dynamo a good fit for high-level graph manipulation?

kmaeng · February 21, 2024, 7:14pm

Dear experts:

I have been using torch.fx for high-level graph manipulation via its proxy/retracing (torch.fx — PyTorch 2.2 documentation).

I found that Dynamo supports graph replacement through its subgraph_rewriter, as shown here: Google Colab
I am curious whether this is strictly superior and is intended to replace torch.fx’s proxy/retracing.

Originally, I thought this was strictly better because I would no longer have to write my own detection and replacement logic. However, I realized that the subgraph_rewriter wants me to define the pattern in terms of ATen operators, like below:

mul_2 = torch.ops.aten.mul.Tensor(x, 0.5)
mul_3 = torch.ops.aten.mul.Tensor(x, 0.7071067811865476)
erf = torch.ops.aten.erf.default(mul_3)
add_1 = torch.ops.aten.add.Tensor(erf, 1)
mul_4 = torch.ops.aten.mul.Tensor(mul_2, add_1)

Using torch.fx’s proxy, I was able to express my logic in plain PyTorch (e.g., 0.5 * x * (1 + torch.erf(x / sqrt(2)))).
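For concreteness, here is a minimal sketch of the torch.fx workflow I mean. The module `M` and the `pattern` / `replacement` functions are made up for illustration, and I am assuming that `torch.fx.subgraph_rewriter.replace_pattern` symbolically traces both functions and matches them against a graph produced by `torch.fx.symbolic_trace`:

```python
import math

import torch
import torch.fx
from torch.fx import subgraph_rewriter


class M(torch.nn.Module):
    """Hypothetical toy module containing an erf-based GELU."""

    def forward(self, x):
        return 0.5 * x * (1 + torch.erf(x / math.sqrt(2))) + 1.0


def pattern(x):
    # The subgraph to find, written in plain PyTorch (no aten ops).
    return 0.5 * x * (1 + torch.erf(x / math.sqrt(2)))


def replacement(x):
    # The subgraph to substitute for each match.
    return torch.nn.functional.gelu(x)


# Trace the module with the proxy-based symbolic tracer.
gm = torch.fx.symbolic_trace(M())

# pattern and replacement are traced the same way, then matched
# against gm.graph; gm is modified in place.
matches = subgraph_rewriter.replace_pattern(gm, pattern, replacement)

print(len(matches))
print(gm.code)
```

The matching here only works because the pattern and the module are traced by the same tracer, so both graphs contain the same Python-level operators. A graph captured at the ATen level would not contain these nodes, which is the mismatch I am asking about.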

Is the Dynamo rewriter only useful for low-level manipulation, and should I keep using torch.fx for high-level manipulation? Or can I use a similar proxy trick inside Dynamo as well?
I am not really interested in optimizing for a specific backend (i.e., not interested in low-level details). I am only interested in playing around at the IR level.