torch.compile: different FX graphs for Inductor and a custom backend

Tiru_B · January 3, 2024, 10:40am

Hi

I have registered a custom backend in PyTorch 2.0 and am running the GPT-J model with both Inductor and the custom backend, named ABC.

import time

import torch
from transformers import AutoTokenizer, GPTJForCausalLM

tokenizer = AutoTokenizer.from_pretrained("hf-internal-testing/tiny-random-gptj")
model = GPTJForCausalLM.from_pretrained("hf-internal-testing/tiny-random-gptj")
inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
outputs = model(**inputs, labels=inputs["input_ids"])
input_ids = inputs["input_ids"]
start_time = time.time()
print("cpu.................")
# Compile generate() with the Inductor backend.
fn_cpu = torch.compile(model.generate, backend="inductor")
with torch.no_grad():
    output_cpu = fn_cpu(input_ids, do_sample=True, temperature=0.9, max_length=200)
print(output_cpu)
print("--- %s pytorch with compile CPU seconds ---" % (time.time() - start_time))
# Clear Dynamo's caches before compiling with the other backend.
torch._dynamo.reset()

I ran the same code with the ABC backend.

I am now capturing the FX graph for both backends.

For Inductor, I capture the graph in the code below:

```
def __call__(self, model_, inputs_):
    from torch._inductor.compile_fx import compile_fx
    from torch.fx.passes import graph_drawer

    print("CPU Graphs")
    # Dump the FX graph as readable Python source.
    model_ir = model_.print_readable(print_output=False)
    with open("pt_graph_fwd_cpu.ir", "w") as file:
        file.write(model_ir)
    # Render the same graph as a PNG.
    gd = graph_drawer.FxGraphDrawer(model_, "f")
    pydot_graph = gd.get_dot_graph()
    pydot_graph.write_png("pt_graph_fwd_cpu.png")

    return compile_fx(model_, inputs_, config_patches=self.config)
```

For the custom backend:

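```
from typing import List

from torch._dynamo import register_backend
from torch._subclasses.fake_tensor import FakeTensor
from torch.fx import GraphModule
from torch.fx.passes import graph_drawer


@register_backend
def ABC_backend(model: GraphModule, inputs: List[FakeTensor]):
    compiled_graph = None

    def fwd(*args):
        nonlocal model
        nonlocal compiled_graph
        if compiled_graph is None:
            # On the first call, dump the FX graph as text and as a PNG.
            model_ir = model.print_readable(print_output=False)
            with open("pt_graph_fwd_cpu.ir", "w") as file:
                file.write(model_ir)
            gd = graph_drawer.FxGraphDrawer(model, "f")
            pydot_graph = gd.get_dot_graph()
            pydot_graph.write_png("pt_graph_fwd_cpu.png")
            # ABCDBACKENDCLASS is the custom backend's compiler class (not shown here).
            compiled_graph = ABCDBACKENDCLASS(model, inputs, args)
            del model
        return compiled_graph(*args)

    return fwd
```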

The issue is that I am seeing different FX graphs for the two flows. What I have observed is that in the Inductor flow compile_fx is called multiple times, so all the subgraphs are merged, whereas for the ABC backend I can see it being called only once.
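
One way to compare the two flows is to count how often Dynamo actually invokes a backend. Dynamo calls the backend once per captured graph, so a function with graph breaks reaches the backend several times. Below is a minimal sketch of that check, not the setup from the post: `counting_backend`, `captured`, and `toy` are hypothetical names, and the explicit `torch._dynamo.graph_break()` stands in for whatever causes graph breaks in `model.generate`.

```python
import torch
import torch._dynamo

# Readable source of every graph handed to the backend, in call order.
captured = []


def counting_backend(gm: torch.fx.GraphModule, example_inputs):
    # Dynamo calls this once per captured graph.
    captured.append(gm.print_readable(print_output=False))
    # Run the graph unmodified (no real compilation in this sketch).
    return gm.forward


def toy(x):
    y = torch.sin(x)
    # Force a graph break so the function is split into two graphs.
    torch._dynamo.graph_break()
    return torch.cos(y)


x = torch.randn(4)
compiled = torch.compile(toy, backend=counting_backend)
out = compiled(x)

print(f"backend was invoked {len(captured)} time(s)")
for index, graph_source in enumerate(captured):
    print(f"--- graph {index} ---")
    print(graph_source)
```

Passing the same kind of counter through both flows would show whether each backend receives the same number of graphs. Writing each entry of `captured` to its own numbered file, instead of a fixed file name, would keep a later graph from overwriting an earlier dump.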