Getting the gradient of the gradient

zanellar · June 13, 2023, 1:00pm

I have a function f(x) that includes a first-order Taylor approximation, e.g. f(x,a)=g(a)+g'(a)(x-a), where g'(x)=dg(x)/dx is computed with torch.autograd. I would now like to differentiate it with respect to a, i.e. compute df(x,a)/da. To simplify the function even further, let's say that f(a)=g'(a):

def g(x):
    return x

def f(a):
    return torch.autograd.functional.jacobian(g, a)

If I visualize the computational graph, it seems that I lose the gradient when computing the Jacobian.
In particular, using these lines I get an empty graph:

a = torch.ones(1, requires_grad=True)
torchviz.make_dot(f(a), params=dict(a=a))

Now, my question is: how can I implement this computation?

KFrank · June 14, 2023, 3:32am

Hi Riccardo!

If I understand your use case, I would use autograd.grad() twice, with
create_graph = True for the first call so that autograd will be able to
differentiate the derivative in the second call.
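
To see why the first call needs the graph, it helps to write the derivative out by hand for the first-order expansion from the question (a short derivation, assuming g is twice differentiable):

```latex
\begin{aligned}
f(x,a) &= g(a) + g'(a)\,(x-a) \\
\frac{\partial f}{\partial a} &= g'(a) + g''(a)\,(x-a) - g'(a) = g''(a)\,(x-a)
\end{aligned}
```

The result involves g''(a), so autograd has to differentiate through g'(a) itself, which is only possible if the graph for g'(a) was kept.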

Here is an example script where g is assumed to be a scalar function:

import torch
print (torch.__version__)

def g (x):
    return  x**3

def f (x, a):
    ga = g (a)
    gpa = torch.autograd.grad (ga, a, create_graph = True)[0]
    return  ga + gpa * (x - a)

x = torch.tensor ([2.1])
a = torch.tensor ([2.0], requires_grad = True)
fxa = f (x, a)
print ('fxa =', fxa)

fpa = torch.autograd.grad (fxa, a)[0]
print ('fpa =', fpa)

And here is its output:

2.0.0
fxa = tensor([9.2000], grad_fn=<AddBackward0>)
fpa = tensor([1.2000])
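
The value 1.2 agrees with the hand computation g''(a)(x - a) = 6 * 2.0 * 0.1 for g(x) = x**3. If you would rather keep the torch.autograd.functional.jacobian call from the question, it also accepts create_graph = True. The following is only a sketch under that assumption, using the same g, x and a as above; the reshape and the variable named expected are illustrative additions, not part of the original script:

```python
import torch
from torch.autograd.functional import jacobian

def g(x):
    return x**3

def f(x, a):
    # create_graph=True asks jacobian to build its result differentiably,
    # so the Jacobian stays connected to a in the autograd graph
    gpa = jacobian(g, a, create_graph=True)  # shape (1, 1) for a one-element a
    return g(a) + gpa.reshape(-1) * (x - a)

x = torch.tensor([2.1])
a = torch.tensor([2.0], requires_grad=True)

fxa = f(x, a)
fpa = torch.autograd.grad(fxa, a)[0]

# Analytic reference for g(x) = x**3: df/da = g''(a) * (x - a) = 6 * a * (x - a)
expected = 6 * a.detach() * (x - a.detach())

print('fxa =', fxa)
print('fpa =', fpa)
print('expected =', expected)
```

Note that with the g(x) = x from the question the derivative g'(a) is the constant 1, so its value does not depend on a at all; a nonlinear g such as x**3 is needed to see a nonzero second derivative.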

Best.

K. Frank

zanellar · June 16, 2023, 2:24pm

Thank you very much!!