Getting gradients of a neural network w.r.t. batched inputs

Hi everyone! I’m trying to implement a network where I can compute derivatives of its output w.r.t. its inputs.

To make things clearer: I’m using the neural network to approximate a scalar-valued function f(x).

Since the output is a scalar, I can get the gradient w.r.t. a single given input simply by using

from torch.autograd import grad

x.requires_grad = True
u = model(x)
u_x = grad( u, x, retain_graph=True, 
            create_graph=True, 
            allow_unused=False)[0]

And this is very nice because I can use the gradient in my loss, for example by building up a regularization term, and I’m able to backpropagate through its graph.
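
To make the regularization idea concrete, here is a minimal sketch of a gradient penalty added to a loss. The small `model`, the `target` value, and the 0.1 weight are made-up stand-ins for illustration, not part of the setup described above:

```python
import torch
from torch import nn
from torch.autograd import grad

torch.manual_seed(0)

# Hypothetical stand-in for the model: maps R^3 to a single scalar
model = nn.Sequential(nn.Linear(3, 16), nn.Tanh(), nn.Linear(16, 1))

x = torch.randn(3, requires_grad=True)   # a single input point
target = torch.tensor(0.5)               # made-up target value

u = model(x).squeeze()                   # 0-dim (scalar) output

# create_graph=True records the graph of the gradient itself, so the
# penalty below can be differentiated w.r.t. the model parameters.
u_x = grad(u, x, create_graph=True)[0]   # shape (3,)

data_loss = (u - target) ** 2
penalty = u_x.pow(2).sum()               # squared norm of the input gradient
loss = data_loss + 0.1 * penalty         # 0.1 is an arbitrary weight

# Backpropagates through both the forward pass and the gradient graph
loss.backward()
```

After `loss.backward()`, every parameter's `.grad` includes the contribution of the penalty term, which is what makes the gradient usable as a regularizer.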

But how can I extend this beyond this simple example? More specifically, I have the following couple of related questions:

  1. Is it possible to extend this beyond the first derivative? Assume that I want to calculate the Laplacian of f(x); how can I do that? I can’t simply run the grad function again like so:
u_xx = grad( u_x, x, retain_graph=True, 
             create_graph=True, 
             allow_unused=False)[0]

since u_x is not a scalar and therefore its gradient is not defined. What I would like to have is essentially the diagonal of the Jacobian matrix of u_x w.r.t. x, but how can I get it?

  2. Can I use batched inputs? Once more, with batched inputs I can’t run the grad function on the output because it will be a vector. Again, this is equivalent to asking for the diagonal of the Jacobian w.r.t. the inputs.

Bumping this up, since I haven’t seen a solution yet (especially for pt. 2)
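
One possible way to handle both the batched-input question and the Laplacian question is to differentiate the sum of the outputs rather than the outputs themselves. This is only a sketch, and it assumes the model treats each row of the batch independently (no batch norm or similar), so that sample i only affects output i. The small `model`, the batch size, and the input dimension are hypothetical stand-ins:

```python
import torch
from torch import nn
from torch.autograd import grad

torch.manual_seed(0)

# Hypothetical stand-in model: maps each row of a (batch, dim) input to one scalar
model = nn.Sequential(nn.Linear(3, 16), nn.Tanh(), nn.Linear(16, 1))

x = torch.randn(8, 3, requires_grad=True)   # batch of 8 points in R^3
u = model(x).squeeze(-1)                     # shape (8,)

# Sample i only influences u[i], so row i of the gradient of u.sum()
# w.r.t. x is exactly du_i/dx_i: all per-sample gradients in one call.
u_x = grad(u.sum(), x, create_graph=True)[0]    # shape (8, 3)

# Laplacian: differentiate each component of u_x once more and keep the
# matching column, i.e. the diagonal of each per-sample Hessian.
laplacian = torch.zeros_like(u)
for d in range(x.shape[1]):
    u_xd = grad(u_x[:, d].sum(), x, create_graph=True)[0]   # shape (8, 3)
    laplacian = laplacian + u_xd[:, d]                       # d^2 u / dx_d^2
```

This needs one extra `grad` call per input dimension, which is fine for low-dimensional inputs but gets expensive as the dimension grows. Because `create_graph=True` is kept throughout, `laplacian` can still be used inside a loss and backpropagated through.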