Get the gradient of a function

This should be a basic question, but maybe I am missing something. Is there a way to automatically derive an analytical function of the gradient, and use this function further?

For example, take a vector $x$ and the sum-of-squares function:

$f(x) = \sum_i x_i^2$

We can define the gradient of $f(x)$ as

$\nabla f(x) = [2x_1, 2x_2, …, 2x_d]^T$

Then we can define a “sum of gradient” function as

$g(x) = \sum_i \nabla f(x)_i = \sum_i 2x_i$

which is itself a scalar function of $x$. But I don't know how to implement $\nabla g(x)$ using PyTorch's autodiff.
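For reference, differentiating $g(x) = \sum_i 2x_i$ by hand gives a constant gradient, which is a handy sanity check for any autodiff implementation:

$\nabla g(x) = [2, 2, \dots, 2]^T$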

Currently I have:

```python
import torch

def f(x):
    return torch.sum(x**2)

def grad(func, x):
    y = func(x)
    y.backward()
    gr = x.grad  # a plain tensor; it is not connected to the autograd graph
    return gr

def sum_of_grad(func, x):
    gr = grad(func, x)
    return torch.sum(gr)

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
s = sum_of_grad(f, x)
```

But `s.backward()` raises an error because `gr = x.grad` carries no gradient information. So is what I am trying to do compatible with PyTorch's autodiff infrastructure, or is it simply not supported? I know autodiff is technically not the same as symbolic differentiation, so I won't be surprised if this is impossible, but I just want to make sure.


If you want to be able to backprop through another backprop, you have to run the first one with `create_graph=True` (see the doc for more details).
Also, I would recommend using `autograd.grad` when computing higher-order derivatives, as it avoids issues with multiple backward passes accumulating gradients in the same place.

```python
import torch
from torch import autograd

def grad(func, x):
    y = func(x)
    # create_graph=True builds a graph of the backward pass itself,
    # so the returned gradient can be differentiated again
    gr = autograd.grad(y, x, create_graph=True)[0]
    return gr
```
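Putting it together, here is a minimal end-to-end sketch using the same toy $f$ from the question. With `create_graph=True`, the sum of the gradient is a differentiable scalar, and `s.backward()` recovers the constant gradient $\nabla g(x) = [2, 2, 2]^T$ derived above:

```python
import torch
from torch import autograd

def f(x):
    return torch.sum(x**2)

def grad(func, x):
    y = func(x)
    # keep the graph of the backward pass so the gradient is differentiable
    return autograd.grad(y, x, create_graph=True)[0]

def sum_of_grad(func, x):
    # g(x) = sum_i 2*x_i, built entirely inside the autograd graph
    return torch.sum(grad(func, x))

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
s = sum_of_grad(f, x)  # 2*1 + 2*2 + 2*3 = 12
s.backward()           # second differentiation now works
print(x.grad)          # tensor([2., 2., 2.])
```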