Let `w` and `phi` be two parameters:

``````
import torch as T
from torch.nn import Parameter

w = Parameter(T.tensor([2.2]))
phi = Parameter(T.tensor([1.5]))
wp = w*phi
wp.backward()
grd = phi.grad   # gradient of wp wrt phi
print(grd)
``````

Printed:

``````
tensor([2.2000])
``````

I want:

``````
tensor([2.2000], requires_grad=True)
``````

i.e. I want `phi.grad` (which equals `w`, a parameter of a larger network) to have `requires_grad=True`, so that I can do

``````
grd.backward()
``````

I don’t know how to separate these two computation graphs.

If I’m understanding correctly, you want to compute second-order gradients wrt `w`. So you’d like `phi.grad` itself to have an autograd graph accumulating into `w` (and thus have `requires_grad=True`).

You could do this by doing `wp.backward(create_graph=True)`.
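
For example, with your toy values, something like this should work (the extra `w.grad = None` is just my addition, so the first-order gradient from the first backward doesn’t get mixed into the second-order one):

``````
w = Parameter(T.tensor([2.2]))
phi = Parameter(T.tensor([1.5]))
wp = w*phi
wp.backward(create_graph=True)   # phi.grad now carries a grad_fn
print(phi.grad)                  # tensor([2.2000], grad_fn=<CopyBackwards>)
w.grad = None                    # drop the first-order gradient d(wp)/dw accumulated above
phi.grad.backward()              # differentiate phi.grad (= w) wrt w
print(w.grad)                    # tensor([1.])
``````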

I actually want the gradient of `wp` only wrt `phi`, so this worked for me:

``````
w = Parameter(T.tensor([2.2]))
phi = Parameter(T.tensor([1.5]))
wp = w*phi
# gradient of wp wrt phi only; create_graph=True makes grd itself differentiable wrt w
grd, = T.autograd.grad(wp, phi, create_graph=True)
print(grd)
grd.backward()
print(w.grad)
``````

output:

``````
tensor([2.2000], grad_fn=<MulBackward0>)
tensor([1.])
``````

Using the last method, modified:

``````
w = Parameter(T.tensor([2.2]))
phi = Parameter(T.tensor([1.5]))
wp = w*phi
wp.backward(create_graph=True)
grd = phi.grad   # phi.grad now carries a grad_fn, so it can be differentiated further
print(grd)
grd.backward()
``````

output:

``````
tensor([2.2000], grad_fn=<CopyBackwards>)
``````
Also, I found a quote which suggests against using `.grad` in such cases.
I believe the quote is saying that `.backward()` is hard to reason about (not `torch.autograd.grad()`). `torch.autograd.grad()` is actually the preferred alternative because (by default) it’s more explicit about which inputs it’s computing gradients for. It also returns the gradients instead of performing a side effect like updating `.grad`.
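
For reference, a minimal sketch of what that looks like with `torch.autograd.grad()` on the same toy example (the name `second` is just illustrative):

``````
import torch as T
from torch.nn import Parameter

w = Parameter(T.tensor([2.2]))
phi = Parameter(T.tensor([1.5]))
wp = w*phi

# gradient of wp wrt phi only; create_graph=True keeps a graph on the result
grd, = T.autograd.grad(wp, phi, create_graph=True)
print(grd)      # tensor([2.2000], grad_fn=<MulBackward0>)

# gradient of grd wrt w, returned directly instead of being accumulated into w.grad
second, = T.autograd.grad(grd, w)
print(second)   # tensor([1.])
``````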