Dispatch ops flow and DTensor

Can someone please explain the control flow for ops: from Python, to how an op gets lowered into C++/CUDA, and back? How does this work for DTensor? Also, where are the ops for DTensor defined, and is there documentation for them?

This might be helpful: “Let’s talk about the PyTorch dispatcher” on ezyang’s blog.

Tensor subclasses like DTensor plug into the PyTorch dispatcher at the Python dispatch key, via the `__torch_dispatch__` protocol: every op call on the subclass is intercepted there before it reaches the C++/CUDA kernels.
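To make that concrete, here is a minimal sketch of a wrapper subclass that hooks the Python key. `LoggingTensor` and its `elem` attribute are hypothetical names for illustration (this is not DTensor itself), but the `__torch_dispatch__` mechanism shown is the same one DTensor uses:

```python
import torch
from torch.utils._pytree import tree_map

class LoggingTensor(torch.Tensor):
    """Hypothetical wrapper subclass: logs every aten op it intercepts."""

    @staticmethod
    def __new__(cls, elem):
        # _make_wrapper_subclass creates a shell tensor; the real data lives in .elem
        r = torch.Tensor._make_wrapper_subclass(
            cls, elem.size(), dtype=elem.dtype, device=elem.device,
            requires_grad=elem.requires_grad,
        )
        r.elem = elem
        return r

    @classmethod
    def __torch_dispatch__(cls, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}

        def unwrap(x):
            return x.elem if isinstance(x, LoggingTensor) else x

        def wrap(x):
            return LoggingTensor(x) if isinstance(x, torch.Tensor) else x

        print(f"dispatched: {func}")  # e.g. the aten.add.Tensor overload
        # Unwrap to plain tensors, run the real op, re-wrap the result
        out = func(*tree_map(unwrap, args), **tree_map(unwrap, kwargs))
        return tree_map(wrap, out)

x = LoggingTensor(torch.ones(2))
y = x + x  # routed through __torch_dispatch__ before hitting the C++ kernel
```

DTensor does the same interception, but instead of just logging, it consults each op's sharding rule and issues the per-shard local op (plus any needed collectives).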

For DTensor specifically, the further per-op dispatching (sharding propagation and local op execution) is defined in torch/distributed/_tensor/dispatch.py.
