This should be a basic question, but maybe I am missing something. Is there a way to automatically derive an analytical function of the gradient, and use this function further?
For example, suppose we have a vector $x$ and the sum-of-squares function:
$f(x) = \sum_i x_i^2$
We can define the gradient of $f(x)$ as
$\nabla f(x) = [2x_1, 2x_2, …, 2x_d]^T$
Then we can define a "sum of gradient" function as
$g(x) = \sum_i \nabla f(x)_i = \sum_i 2x_i$
which is, itself, a scalar function of $x$. But I don't know how to implement $\nabla g(x)$ using PyTorch's autodiff.
Currently I have:

```python
import torch

def f(x):
    return torch.sum(x**2)

def grad(func, x):
    y = func(x)
    y.backward()
    gr = x.grad
    return gr

def sum_of_grad(func, x):
    gr = grad(func, x)
    return torch.sum(gr)

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
s = sum_of_grad(f, x)
s.backward()
print(x.grad)
```
Here `s.backward()` raises an error, because `gr = x.grad` is a plain tensor that carries no gradient history of its own. So is what I am trying to do compatible with PyTorch's autodiff infrastructure, or is it simply not supported? I know autodiff is technically not the same as symbolic differentiation, so I won't be surprised if this is impossible, but I just want to make sure.
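For reference, here is a variant I experimented with using `torch.autograd.grad` with `create_graph=True` (so the first gradient stays part of the graph); it seems to avoid the error, though I am not sure it is the intended pattern:

```python
import torch

def f(x):
    # f(x) = sum_i x_i^2
    return torch.sum(x**2)

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = f(x)

# create_graph=True keeps gr differentiable, so gr = 2x is itself
# a node in the autograd graph rather than a detached tensor.
(gr,) = torch.autograd.grad(y, x, create_graph=True)

# g(x) = sum_i 2 x_i, so nabla g(x) should be [2, 2, 2].
s = torch.sum(gr)
s.backward()
print(x.grad)  # tensor([2., 2., 2.])
```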