The documentation states that “only leaf tensors will have their grad populated during a call to backward”. Maybe we should add that there’s one additional condition: for its grad to be populated, a leaf tensor must also have requires_grad set to True?
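Without that condition, the page is slightly confusing, since at the top it says “all tensors that have requires_grad which is False will be leaf Tensors by convention”. Read together, the two sentences could suggest that tensors with requires_grad set to False also get a grad.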
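
Here is a small sketch of what I mean, assuming a recent PyTorch install. The tensor names `a`, `w`, and `y` are just made up for illustration. As far as I can tell, both `a` and `w` are leaves, but only the one that requires grad ends up with a populated `.grad`:

```python
import torch

# Leaf by convention: requires_grad is False.
a = torch.ones(3)

# Leaf created by the user, with requires_grad=True.
w = torch.ones(3, requires_grad=True)

# Result of an operation on a tensor that requires grad: not a leaf.
y = w * a

y.sum().backward()

print(a.is_leaf, a.requires_grad, a.grad)  # True False None
print(w.is_leaf, w.requires_grad, w.grad)  # True True tensor([1., 1., 1.])
print(y.is_leaf, y.requires_grad)          # False True
```

So being a leaf is not enough on its own: `a` is a leaf, yet its `.grad` stays `None` after `backward()`.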