What is the purpose of `is_leaf`?

The docs don’t really explain what `is_leaf` is for. In particular, I think the sentence that says “Only leaf Tensors will have their grad populated…” is misleading. From what I can guess, being a leaf isn’t what decides whether `grad` gets populated; `requires_grad` is what governs that.
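A quick way to test this guess is to run a tiny backward pass and look at which tensors end up with `.grad`. This is only a sketch with made-up tensors (`w`, `x`, `h`, `h2` are names chosen for illustration). As far as I can tell, it suggests both properties matter: by default `.grad` is filled in only for tensors that are leaves and have `requires_grad = True`, while a non-leaf with `requires_grad = True` gets a `.grad` only after opting in with `retain_grad()`.

```python
import warnings

import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)  # leaf, requires_grad=True
x = torch.tensor([3.0, 4.0])                      # leaf, requires_grad=False
h = w * x                                         # non-leaf, requires_grad=True
h.sum().backward()

# Reading .grad on a non-leaf tensor emits a warning; silence it for this check.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    h_grad = h.grad

for name, t, g in [("w", w, w.grad), ("x", x, x.grad), ("h", h, h_grad)]:
    print(
        name,
        "is_leaf:", t.is_leaf,
        "requires_grad:", t.requires_grad,
        "grad populated:", g is not None,
    )

# Opting in: a non-leaf keeps its gradient if retain_grad() is called
# before backward().
h2 = w * x
h2.retain_grad()
h2.sum().backward()
print("h2 grad populated after retain_grad():", h2.grad is not None)
```

So the sentence in the docs seems to describe the default behaviour, with `requires_grad` as an additional condition rather than the only one.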

I think the `is_leaf` property is really about the reverse-graph. When `x.backward()` is called, all of the action happens on the “reverse-differentiation mode” graph at `x`. `x` is the root, and the graph runs up along (against arrows in) the forward graph from `x`. While only tensors with `requires_grad = True` appear in the reverse-graph, the `grad_fn` of every tensor visited (even those with `requires_grad = False`) is used to produce the maps corresponding to the reverse-graph’s arrows. When the process hits a non-leaf, it knows it can keep mapping along to more nodes. On the other hand, when the process hits a leaf, it knows to stop; leaves have no `grad_fn`.
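The reverse-graph picture can be poked at directly, since each `grad_fn` exposes its parents through `next_functions`. The sketch below assumes a toy graph, and `walk` is a hypothetical helper written for this post, not a PyTorch API. It starts at the root's `grad_fn` and follows `next_functions` until there is nothing left to follow.

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)  # leaf, requires_grad=True
x = torch.tensor([3.0, 4.0])                      # leaf, requires_grad=False
loss = (w * x).sum()                              # non-leaf root


def walk(fn, depth=0, visited=None):
    """Collect (depth, node type name) for every node reachable from fn."""
    visited = [] if visited is None else visited
    if fn is None:
        # An input that does not require grad contributes no node at all.
        return visited
    visited.append((depth, type(fn).__name__))
    for next_fn, _input_index in fn.next_functions:
        walk(next_fn, depth + 1, visited)
    return visited


nodes = walk(loss.grad_fn)
for depth, name in nodes:
    print("  " * depth + name)

# Leaves have no grad_fn, so the walk cannot continue past them.
print("w.grad_fn:", w.grad_fn)
print("x.grad_fn:", x.grad_fn)
```

In this toy case the walk ends at an `AccumulateGrad` node standing in for `w`, while `x` shows up only as a `None` entry in `next_functions`. That is consistent with leaves being where the traversal stops, although `x` can still be used as a saved value inside the multiplication's backward function.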

If this is right, it makes it clearer why weights are “leaves with `requires_grad = True`” and inputs are “leaves with `requires_grad = False`”. You could even take this as a definition of “weights” and “inputs”.
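Taking that definition literally, a small classifier over the two flags might look like the sketch below. `role` is a hypothetical helper and the labels are just the terms used in this post, not PyTorch terminology. The `nn.Linear` layer and the random batch are stand-ins for a real model and real data.

```python
import torch
from torch import nn


def role(t: torch.Tensor) -> str:
    """Label a tensor using only is_leaf and requires_grad."""
    if t.is_leaf:
        return "weight" if t.requires_grad else "input"
    return "intermediate"  # produced by an op that autograd is tracking


layer = nn.Linear(3, 2)    # parameters are leaves with requires_grad=True
batch = torch.randn(4, 3)  # plain data: leaf with requires_grad=False
out = layer(batch)         # non-leaf: it has a grad_fn

roles = {
    "layer.weight": role(layer.weight),
    "layer.bias": role(layer.bias),
    "batch": role(batch),
    "out": role(out),
}
for name, label in roles.items():
    print(name, "->", label)
```

One caveat with this labelling: anything computed purely from tensors with `requires_grad = False` (for example `batch * 2`) is also reported as a leaf, so it would be classed as an “input” here even though it is a derived value.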