As with most other auto-differentiation libraries, PyTorch doesn’t require you to explicitly define a graph object that records computations (a computational graph).
I was wondering: how does PyTorch’s autograd accomplish this? Does a tensor produced by a tensor operation store references to its child tensors? Is there a global variable that holds all tensors, which tensor handles can point to? Or is it done some other way?
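To show what I mean, here is a small example poking at the public attributes a result tensor exposes (this is just what I can observe from Python, not a claim about the internals):

```python
import torch

# Two leaf tensors that require gradients.
a = torch.ones(2, requires_grad=True)
b = torch.full((2,), 3.0, requires_grad=True)

# The result of an operation carries a grad_fn attribute that
# seems to record which operation produced it.
c = a * b
print(c.grad_fn)                 # e.g. <MulBackward0 object at ...>

# next_functions appears to link back toward the operation's inputs,
# which suggests some per-tensor record of the computation.
print(c.grad_fn.next_functions)
```

So it looks like each non-leaf tensor carries some record of its origin, but I’m not sure whether that is the whole story.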
I don’t know C++ well, so I can’t dig into the codebase to understand it, but maybe there’s a good conceptual explanation that’s independent of the implementation language?
Thanks in advance!