Developer documentation for PyTorch 2.x compiler

anubane · November 10, 2023, 12:28pm

Could someone please point me to the developer docs for Dynamo and Inductor?

Please DO NOT give me the user doc links currently available on the official PyTorch docs page. I need to understand the complete flow of dynamo so as to extend it for my own purpose.

Specifically, I need to understand how Dynamo captures the forward and backward graphs when training an arbitrary model: how it handles conditional branches, loops, and dynamic tensor shapes.

Any help is welcome.

anubane · November 14, 2023, 1:56pm

I was following the example at this PyTorch tutorial page

There is a discrepancy. The tutorial states that upon executing the command:

TORCH_COMPILE_DEBUG=1 python example.py

the generated output_code.py file must contain the Triton IR, but I am seeing C++ code instead:

cpp_fused_add_cos_sin_0 = async_compile.cpp('''
#include "/tmp/torchinductor_<username>/ib/<a_random_hash_like_string>.h"
extern "C" void kernel(const float* in_ptr0,
                       float* out_ptr0)
{
    {
        for(long i0=static_cast<long>(0L); i0<static_cast<long>(10000L); i0+=static_cast<long>(16L))
        {
            auto tmp0 = at::vec::Vectorized<float>::loadu(in_ptr0 + static_cast<long>(i0));
            auto tmp1 = tmp0.cos();
            auto tmp2 = tmp0.sin();
            auto tmp3 = tmp1 + tmp2;
            tmp3.store(out_ptr0 + static_cast<long>(i0));
        }
    }
}
''')

This is embedded within other generated Python code.

What am I missing?

bdhirsh · November 15, 2023, 2:36am

@anubane Inductor (the default compiler) will emit C++ code if it’s given CPU tensor inputs, and it will emit Triton if it’s given CUDA tensor inputs.
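A minimal sketch of a script that could be used to check this behaviour. The function names `pick_device`, `fn`, and `run_compiled` are hypothetical and chosen for illustration; they are not taken from the tutorial. It only assumes that `torch.compile` uses Inductor by default and that compilation is triggered on the first call.

```python
import torch


def pick_device() -> str:
    """Return "cuda" when a CUDA device is visible, otherwise "cpu"."""
    return "cuda" if torch.cuda.is_available() else "cpu"


def fn(x: torch.Tensor) -> torch.Tensor:
    # Pointwise ops like these are candidates for fusion into one kernel.
    return torch.cos(x) + torch.sin(x)


def run_compiled(device: str) -> torch.Tensor:
    """Compile fn with the default backend and run it once on `device`."""
    compiled_fn = torch.compile(fn)  # default backend is "inductor"
    x = torch.randn(10000, device=device)
    return compiled_fn(x)  # compilation happens on this first call
```

Calling `run_compiled(pick_device())` from a script started with TORCH_COMPILE_DEBUG=1 should, per the reply above, produce an output_code.py containing a C++ kernel when the input is on the CPU and a Triton kernel when it is on a CUDA device.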

xgbj · January 23, 2024, 3:12am

I support @anubane’s request. I would like to know if there are any good ways to learn torch.compile. It’s not easy to analyze and debug problems encountered during compilation.

marksaroufim · January 23, 2024, 4:38am

We’ve recently tried to aggregate some resources:

dev docs here torch.compiler — PyTorch main documentation
There’s also the 2.0 series here https://www.youtube.com/watch?v=v4nDZTK_eJg&list=PL_lsbAsL_o2CQr8oh5sNWt96yWQphNEzM
The PyTorch dev podcast https://pytorch-dev-podcast.simplecast.com/

Some community tutorials I personally really liked:

https://x.com/tarantulae/status/1734192354716725401?s=20
How Pytorch 2.0 Accelerates Deep Learning with Operator Fusion and CPU/GPU Code-Generation | by Shashank Prasanna | Towards Data Science

I understand the sentiment that torch.compile is complicated, but then again compilers are generally complicated. What torch.compile has going for it is that it’s mostly written in Python. If there are concrete parts of the stack you find hard to parse and would like more tutorials written about, please let me know.

xgbj · January 23, 2024, 4:46am

Thanks a lot! this is exactly what I wanted.

anubane · January 30, 2024, 6:02am

Thanks a lot for the pointers!

I would specifically like to understand the codegen part of TorchInductor, along with the overall code structure of the Inductor module.

marksaroufim · January 30, 2024, 6:07am

Here you go https://www.youtube.com/watch?v=p13HpZv2S3Q

bdhirsh · January 30, 2024, 2:32pm

There are also two (excellent) podcasts on Inductor internals here:

Inductor IR: Spotify

Details on inductor’s define-by-run IR semantics: Spotify

anubane · January 31, 2024, 8:09am

I believe I have been unable to explain my question:

I am NOT looking for developer docs from the perspective of model developers who USE PyTorch.

I am looking for developer docs and design docs from the perspective of PyTorch core developers: the logic and workings of the source code are what I need to understand and modify.

Most of the links here explain things from the perspective of PyTorch users.

(Further, I prefer docs/videos over podcasts since podcasts make it hard to follow the actual source code)

Currently, the only way to understand this seems to be to build PyTorch in debug mode and step through the Inductor module to follow the flow. This can be done but is tedious, which is why I was looking for some documentation.

(the question was cross posted on the developer forum)

marksaroufim · January 31, 2024, 9:50pm

The videos and podcasts we shared were indeed targeted at new PyTorch developers. Honestly, I only started to learn how PT2 worked after filing and fixing some random easy bugs (Issues · pytorch/pytorch · GitHub). Otherwise, a debugger sounds great; that’s how I ramp up on new projects.

anubane · February 1, 2024, 5:30am

I understand that for a beginner like me, there will always be a learning curve. However, since PT2.x is new and the components in question are different from PT1.x, I was hoping to catch up sooner. The resources shared here are great and give a holistic overview, but since they move on to performance comparisons and benefits to end users of PT, they do not serve my purpose.

I am trying to extend PT to a custom backend. While a public API is available for Dynamo, a similar API is not publicly available for Inductor. Therefore I need to understand the module itself.

For example, in the video:

The logic of Inductor is explained, but that is not enough for me to understand the actual source code and its structure: what goes in the codegen module, what goes in the scheduler module, how to extend them, etc.
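For reference, the public Dynamo API mentioned above can be exercised with a small custom backend. This is only a sketch: `inspecting_backend` and `captured_ops` are hypothetical names, and the backend does no code generation of its own. It records the nodes of the FX graph that Dynamo captured and then returns the graph's own `forward`, so the function still runs eagerly.

```python
from typing import Callable, List

import torch

# Hypothetical log of the FX nodes seen by the backend.
captured_ops: List[str] = []


def inspecting_backend(
    gm: torch.fx.GraphModule, example_inputs: List[torch.Tensor]
) -> Callable:
    """Record each node of the captured graph, then run it unmodified."""
    for node in gm.graph.nodes:
        captured_ops.append(f"{node.op}:{node.target}")
    # Returning gm.forward means no custom code generation takes place.
    return gm.forward


def fn(x: torch.Tensor) -> torch.Tensor:
    return torch.cos(x) + torch.sin(x)


# Dynamo hands the captured graph to the callable passed as `backend`.
compiled_fn = torch.compile(fn, backend=inspecting_backend)
result = compiled_fn(torch.zeros(4))
```

After the call, `captured_ops` lists the placeholder, call_function, and output nodes of the captured graph. A real backend would lower `gm` to its own target at this point instead of returning `gm.forward`; this sketch does not touch Inductor's codegen or scheduler.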

anubane · February 6, 2024, 6:59am

Found a relevant post: What should we do about developer documentation?