Semantics of autograd.grad with `only_inputs=False`

When you call torch.autograd.grad with only_inputs=False, does it fill in the .grad member for all of the variables in the graph, or only for the variables that are needed to reach the inputs? Is this behavior well-defined? Is there a way to have autograd compute the gradient of everything that does not come before a certain variable (for instance, to implement lazy truncated BPTT in an RNN)?

only_inputs=False for torch.autograd.grad now seems to be deprecated (although it is still described as a feature in the master docs for version 0.4.0). When used, it issues the following warning (see torch/autograd/

“only_inputs argument is deprecated and is ignored now (defaults to True). To accumulate gradient for other parts of the graph, please use torch.autograd.backward.”
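To make the difference concrete, here is a minimal sketch of what the warning is pointing at, assuming a PyTorch version where tensors carry requires_grad directly (0.4 or later). The tensors w, x, h and loss are made up for illustration and are not from the original question. As far as I can tell, torch.autograd.grad only returns the gradients you ask for and leaves .grad alone, while torch.autograd.backward accumulates into .grad of the leaves (and of non-leaf tensors that opted in via retain_grad()).

```python
import torch

# Hypothetical toy graph: w and x are leaves, h is an intermediate (non-leaf) tensor.
w = torch.tensor([1.0, 2.0], requires_grad=True)
x = torch.tensor([3.0, 4.0], requires_grad=True)
h = w * x
loss = h.sum()

# torch.autograd.grad returns the requested gradients as a tuple
# and does not write to .grad on any tensor.
(grad_w,) = torch.autograd.grad(loss, [w], retain_graph=True)
grad_untouched = w.grad is None and x.grad is None

# Non-leaf tensors do not keep their gradient unless asked to.
h.retain_grad()

# torch.autograd.backward accumulates into .grad of every leaf that
# requires grad, plus any non-leaf tensor that called retain_grad().
torch.autograd.backward(loss)

print(grad_w, w.grad, x.grad, h.grad)
```

So if the goal is to populate .grad for parts of the graph other than the listed inputs, backward (or loss.backward()) seems to be the intended route rather than only_inputs=False.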