Get the gradient of a function

This should be a basic question, but maybe I am missing something. Is there a way to automatically derive an analytical function of the gradient, and use this function further?

For example, take a vector $x$ and the sum-of-squares function:

$f(x) = \sum_i x_i^2$

We can define the gradient of $f(x)$ as

$\nabla f(x) = [2x_1, 2x_2, …, 2x_d]^T$

Then we can define a “sum of gradient” function as

$g(x) = \sum_i \nabla f(x)_i = \sum_i 2x_i$

which is itself a scalar function of $x$. But I don't know how to implement $\nabla g(x)$ using PyTorch's autodiff.
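For reference, differentiating $g(x) = \sum_i 2x_i$ by hand gives a constant gradient, which is a handy sanity check for any autodiff implementation:

$\nabla g(x) = [2, 2, \dots, 2]^T$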

Currently I have:

```python
import torch

def f(x):
    return torch.sum(x**2)

def grad(func, x):
    y = func(x)
    y.backward()
    gr = x.grad  # a plain tensor; it is not connected to the autograd graph
    return gr

def sum_of_grad(func, x):
    gr = grad(func, x)
    return torch.sum(gr)

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
s = sum_of_grad(f, x)
```

But `s.backward()` raises an error because `gr = x.grad` carries no gradient information. So is what I am trying to do compatible with PyTorch's autodiff infrastructure, or is it simply not supported? I know autodiff is technically not the same as symbolic differentiation, so I won't be surprised if this is impossible, but I just want to make sure.


If you want to be able to backprop through another backprop, you have to run the first one with `create_graph=True` (see the doc for more details).
Also, I would recommend using `autograd.grad` when computing higher-order derivatives, as it avoids issues with multiple backward passes accumulating gradients in the same place.

```python
import torch
from torch import autograd

def grad(func, x):
    y = func(x)
    # create_graph=True builds a graph of the backward pass itself,
    # so the returned gradient can be differentiated again
    gr = autograd.grad(y, x, create_graph=True)[0]
    return gr
```
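Putting it together, here is a minimal end-to-end sketch using the same toy $f$ from the question. With `create_graph=True`, the sum of the gradient is a differentiable scalar, and `s.backward()` recovers the constant gradient $\nabla g(x) = [2, 2, 2]^T$ derived above:

```python
import torch
from torch import autograd

def f(x):
    return torch.sum(x**2)

def grad(func, x):
    y = func(x)
    # keep the graph of the backward pass so the gradient is differentiable
    return autograd.grad(y, x, create_graph=True)[0]

def sum_of_grad(func, x):
    # g(x) = sum_i 2*x_i, built entirely inside the autograd graph
    return torch.sum(grad(func, x))

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
s = sum_of_grad(f, x)  # 2*1 + 2*2 + 2*3 = 12
s.backward()           # second differentiation now works
print(x.grad)          # tensor([2., 2., 2.])
```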