Hi everyone! I’m trying to implement a network where I can define derivatives of its output w.r.t. the inputs.

To make things clearer: I’m using the neural network to approximate a scalar-valued function `f(x)`.

Since the output is a scalar, I can get the gradient w.r.t. a given **single** input simply using:

```
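from torch.autograd import grad

x.requires_grad = True  # track operations on the input
u = model(x)            # scalar output for a single input
# create_graph=True keeps u_x differentiable, so it can be used inside the loss
u_x = grad(u, x, retain_graph=True, create_graph=True, allow_unused=False)[0]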
```

And this is very nice because I can use the gradient in my loss, for example by building up a regularization term, and I’m able to backpropagate through its graph.
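
To illustrate what I mean, here is a minimal sketch of such a regularized loss. The small `model`, the `target` value, and the weight `lam` are made-up placeholders for illustration, not my actual setup.

```python
import torch
from torch.autograd import grad

torch.manual_seed(0)

# Hypothetical stand-in for the model: maps a 3-dimensional input to a scalar.
model = torch.nn.Sequential(
    torch.nn.Linear(3, 16), torch.nn.Tanh(), torch.nn.Linear(16, 1)
)

x = torch.randn(3, requires_grad=True)  # a single input
target = torch.tensor(0.5)              # made-up target value

u = model(x).squeeze()                  # scalar output
u_x = grad(u, x, create_graph=True)[0]  # gradient w.r.t. the input, still differentiable

# Data term plus a penalty on the squared norm of the input gradient.
lam = 0.1                               # hypothetical regularization weight
loss = (u - target) ** 2 + lam * u_x.pow(2).sum()
loss.backward()                         # backpropagates through u_x as well
```

After `loss.backward()`, the parameters of `model` hold gradients that include the contribution of the penalty term.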

But how can I extend this beyond this simple example? More specifically, I have the following couple of related questions:

- Is it possible to extend this beyond the first derivative? Assume that I want to calculate the Laplacian of `f(x)`: how can I do that? I **can’t** simply run the `grad` function again like so

```
u_xx = grad(u_x, x, retain_graph=True, create_graph=True, allow_unused=False)[0]
```

since `u_x` is not a scalar and therefore the gradient doesn’t exist. What I would like to have is practically the diagonal of the Jacobian matrix of `u_x` w.r.t. `x` (whose sum is the Laplacian), but how can I get it?

- Can I use batched inputs? Once more, with batched inputs I can’t run the `grad` function on the output because it will be a vector. Again, this is equivalent to asking for the diagonal of the Jacobian w.r.t. the inputs.
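
For both questions, here is a minimal sketch of one possible direction. It assumes that each output in the batch depends only on its own input row (so no BatchNorm in training mode or similar). Under that assumption, summing the outputs gives a scalar whose gradient contains exactly the per-sample gradients, and repeating the trick per input dimension gives the diagonal second derivatives. The helper name `batched_grad_and_laplacian` and the test function `f` are hypothetical.

```python
import torch
from torch.autograd import grad


def batched_grad_and_laplacian(model, x):
    """Per-sample gradient (N, d) and Laplacian (N,) of a scalar-output model.

    Assumes x has shape (N, d), x.requires_grad is True, model(x) has shape (N,),
    and sample n of the output depends only on row n of x.
    """
    u = model(x)                                  # shape (N,)
    # Summing over the batch yields a scalar; row n of the result is du_n/dx_n.
    u_x = grad(u.sum(), x, create_graph=True)[0]  # shape (N, d)

    lap = x.new_zeros(x.shape[0])
    for i in range(x.shape[1]):
        # Differentiate the i-th gradient component and keep d^2 u / dx_i^2.
        u_xi_x = grad(u_x[:, i].sum(), x, create_graph=True)[0]
        lap = lap + u_xi_x[:, i]
    return u_x, lap


# Check against a function with known derivatives: f(x) = sum_i x_i^3.
x = torch.randn(5, 3, requires_grad=True)
f = lambda x: (x ** 3).sum(dim=1)
u_x, lap = batched_grad_and_laplacian(f, x)
```

For this `f`, the gradient should match `3 * x**2` and the Laplacian `6 * x.sum(dim=1)`. The loop costs one extra backward pass per input dimension, so it is only practical for low-dimensional inputs; `torch.autograd.functional.hessian` might be an alternative for single inputs.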