The PyTorch blog post "A Tour of PyTorch Internals" is out of date. How can I learn more about PyTorch internals?


(Ke Bai) #1

I recently started reading the PyTorch source code, but I am totally lost. In the tutorial, Part One introduces the “Tensor type”, but “csrc/generic/Tensor.cpp” has disappeared. I only found “Tensor.h” in the ATen library and in pytorch/torch/lib/THD/master_worker/worker/dispatch/ (but the latter folder is now marked as “dead”).

I have read the short introductions to PyTorch internals in the documentation and tutorials. So far I only know that ATen is the tensor library of PyTorch (its source code is easier to read since the pieces have clear meanings) and that torch/csrc is the part that integrates C++ and Python.

I have also tried to find hints on GitHub by reading the pull request history, but that is quite hard.

I am familiar with basic C++ syntax but not with how large codebases are organized. My own past C++ code consisted of just a few xxx.h and xxx.c files, so I am quite confused when facing such a big project. Can anyone give me some suggestions on how to read the source code?

Thanks!


(Simon Wang) #2

Sorry that the post is out of date. Currently I don’t think there is a good guide on this topic, especially since we are in the middle of a big refactoring. We may have something once PyTorch 1.0 is out. May I ask why you want to read the source code?


(Jiaming Sun) #3

Hi, I’m also interested in reading the source code of PyTorch. As a C++/CUDA newbie, I have also found it hard to start digging into the code.

My main reason is that I want to understand PyTorch (and Python) more deeply and become a better user of it. I’m having a hard time figuring out where variables live (CPU or GPU, which GPU, etc.) and how they are passed in memory (by reference or by copy) when I write data augmentation code on the GPU. I’m also interested in how PyTorch interacts with Python and C++/CUDA code. And since it’s such a great library, maybe I could become a contributor once I have a better understanding of the internals.

It would be nice if there were a getting-started guide for developers.

Thanks for writing this amazing library!


(Simon Wang) #4

Thanks for explaining!

For a very quick and high-level description:

  1. Tensor operations are implemented in the C++ tensor library called ATen, which also includes some old code from the Torch days (we are rewriting that for PyTorch 1.0!). It’s important to note that ATen doesn’t have autograd functionality (at least not currently); it only provides a Tensor API and operators on Tensors of various types, e.g., CPUFloatTensor.

    Since there are many different Tensor types and different operators, a lot of codegen is applied here, which unfortunately makes the code harder to read for new contributors. But we are constantly improving upon it!

  2. To connect ATen with Python, we have a lot of C++ code living under torch/csrc, implementing things like autograd, the JIT, and the dtype, device, and layout objects. In particular, for autograd, we subclass each of the ATen tensor types with something called VariableType, which implements the functionality to track the computation graph (i.e., history) that the autograd engine uses to back-propagate and compute gradients.

    Similar to ATen, we also use some codegen to generate the operator interface on VariableTypes and, more importantly, to generate the backward pass for each operator. The mapping between forward and backward functions is listed in a yaml file, with many backward functions implemented in Functions.cpp. Note that all operations in Functions.cpp operate on Variables, i.e., Tensors with VariableType, which means that if a function, e.g. my_op_backward, is implemented using existing operations that support autograd, the double backward also works out of the box! (Of course, one may also implement a custom double backward for efficiency reasons.)

  3. Finally, the connection between C++ land (i.e., torch/csrc) and Python is mostly implemented with the Python C API and pybind11.
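
To make the "double backward works out of the box" remark in point 2 more concrete, here is a toy sketch in plain Python. It is not PyTorch code and does not mirror the real implementation: `Var`, `as_var`, and `grad` are hypothetical names invented for illustration. The only idea it tries to show is that when a backward rule is itself written with operations that record history, the gradient it produces can be differentiated again.

```python
class Var:
    """Toy stand-in for a Variable: a value plus the history that produced it."""

    def __init__(self, value, parents=(), backward_fn=None):
        self.value = float(value)
        self.parents = parents          # inputs this Var was computed from
        self.backward_fn = backward_fn  # grad_output (Var) -> one grad (Var) per parent

    def __add__(self, other):
        other = as_var(other)
        return Var(self.value + other.value, (self, other), lambda g: (g, g))

    def __mul__(self, other):
        other = as_var(other)
        # The backward rule is written with Var multiplication, so the
        # gradient it returns carries its own history.
        return Var(self.value * other.value, (self, other),
                   lambda g: (g * other, g * self))

    __radd__ = __add__
    __rmul__ = __mul__


def as_var(x):
    """Wrap plain numbers so they can take part in the graph."""
    return x if isinstance(x, Var) else Var(x)


def grad(output, inputs):
    """Reverse-mode gradients of `output` with respect to each Var in `inputs`."""
    order, seen = [], set()

    def visit(v):
        # Post-order DFS: every parent is appended before its consumer.
        if id(v) in seen:
            return
        seen.add(id(v))
        for p in v.parents:
            visit(p)
        order.append(v)

    visit(output)
    grads = {id(output): Var(1.0)}
    for v in reversed(order):  # consumers are processed before their inputs
        g = grads.get(id(v))
        if g is None or v.backward_fn is None:
            continue
        for p, pg in zip(v.parents, v.backward_fn(g)):
            # Accumulate with Var addition so the sum is recorded as well.
            grads[id(p)] = pg if id(p) not in grads else grads[id(p)] + pg
    return [grads.get(id(x), Var(0.0)) for x in inputs]


x = Var(3.0)
y = x * x * x                # y = x^3
(dy,) = grad(y, [x])         # dy/dx = 3x^2
(d2y,) = grad(dy, [x])       # d2y/dx2 = 6x, obtained by differentiating dy
print(dy.value, d2y.value)   # 27.0 18.0
```

Nothing special was written for the second derivative here: `grad(dy, [x])` works only because the backward rules in `__add__` and `__mul__` are built from tracked operations, which is the same reason a `my_op_backward` built from autograd-aware operations gets double backward for free.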


(Simon Wang) #5

Oh I just realized that you are not OP LOL. But hope my reply helps!