Get the gradient of a function

zyl1024 · March 29, 2020, 12:05am

This should be a basic question, but maybe I am missing something. Is there a way to automatically derive an analytical function of the gradient, and use this function further?

For example, we have a vector $x$ and the sum-of-squares function: (sorry, I don’t know how to insert LaTeX math…)

$f(x) = \sum_i x_i^2$

We can define the gradient of $f(x)$ as

$\nabla f(x) = [2x_1, 2x_2, \dots, 2x_d]^T$

Then we can define a “sum of gradient” function as

$g(x) = \sum_i \nabla f(x)_i = \sum_i 2x_i$

which is itself a scalar function of $x$. But I don’t know how to implement $\nabla g(x)$ using PyTorch’s autodiff.

Currently I have,

import torch
def f(x):
    return torch.sum(x**2)
def grad(func, x):
    y = func(x)
    y.backward()
    gr = x.grad
    return gr
def sum_of_grad(func, x):
    gr = grad(func, x)
    return torch.sum(gr)
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
s = sum_of_grad(f, x)
s.backward()
print(x.grad)

But s.backward() raises an error because gr = x.grad carries no gradient information. Is what I am trying to do compatible with PyTorch’s autodiff infrastructure, or is it not supported? I know autodiff is technically not the same as symbolic differentiation, so I won’t be surprised if this is not possible, but I just want to make sure.
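The failure described above can be reproduced in isolation. The following is a minimal sketch, assuming a recent PyTorch release: after a plain backward() call, x.grad holds the right numbers but is not attached to any graph, so nothing computed from it can be differentiated further.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# First-order backward pass without create_graph (the default).
torch.sum(x ** 2).backward()

print(x.grad)                # tensor([2., 4., 6.])
print(x.grad.requires_grad)  # False: the gradient is a plain tensor
print(x.grad.grad_fn)        # None: no history to backprop through

# Anything built from x.grad inherits that lack of history.
s = torch.sum(x.grad)
print(s.requires_grad)       # False, so s.backward() would raise
```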

albanD · March 29, 2020, 1:33am

Hi

If you want to be able to backprop through another backprop, you have to run the first one with create_graph=True (see the docs for more details).
I would also recommend using autograd.grad when computing higher-order derivatives, as it avoids any issues with multiple backward passes accumulating gradients in the same place.

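from torch import autograd

def grad(func, x):
    y = func(x)
    # create_graph=True records the backward pass so it can be differentiated again
    gr = autograd.grad(y, x, create_graph=True)[0]
    return gr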
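Putting the suggestion together with the code from the question gives a complete second-order example. This is a sketch rather than a canonical recipe: f and sum_of_grad are taken from the original post, and the name grad_g is introduced here only for illustration. Since $g(x) = \sum_i 2x_i$, the expected result worked out by hand is $\nabla g(x) = [2, 2, 2]^T$ for a three-element input.

```python
import torch
from torch import autograd

def f(x):
    return torch.sum(x ** 2)

def grad(func, x):
    y = func(x)
    # Keep the graph of the backward pass so the result stays differentiable.
    return autograd.grad(y, x, create_graph=True)[0]

def sum_of_grad(func, x):
    return torch.sum(grad(func, x))

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

s = sum_of_grad(f, x)            # g(x) = 2*1 + 2*2 + 2*3
grad_g = autograd.grad(s, x)[0]  # gradient of g with respect to x

print(s.item())  # 12.0
print(grad_g)    # tensor([2., 2., 2.])
```

Using autograd.grad for the second derivative as well means x.grad is never written to, so there is no accumulated first-order gradient to zero out between calls.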