Hello,

I’m working on a Physics-Informed Neural Network (PINN) and I need to take the derivatives of the outputs w.r.t. the inputs and use them in the loss function.

The issue is related to the neural network’s multiple outputs. I tried to use `autograd.grad` to calculate the derivatives of the outputs, but it sums all the contributions.

For example, if my output `u` has shape [batch_size, n_output], the derivative `dudx` has shape [batch_size, 1] instead of [batch_size, n_output].

Due to the sum, I can’t use the derivatives in the loss function. I tried calculating each derivative in a for loop, but the training takes forever. Do you have any idea how to solve this problem? Thanks in advance
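A minimal snippet reproducing the summing behaviour (toy tensors standing in for my real network):

```python
import torch

# toy stand-in for the network: two outputs per sample
x = torch.randn(4, 1, requires_grad=True)   # inputs, shape [batch_size, 1]
u = torch.cat([2 * x, 3 * x], dim=1)        # outputs, shape [batch_size, 2]

# autograd.grad contracts over all outputs via grad_outputs
dudx = torch.autograd.grad(u, x, grad_outputs=torch.ones_like(u))[0]
print(dudx.shape)  # torch.Size([4, 1]) -- both output columns summed (2 + 3 = 5)
```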

You could have a look at using `torch.func.jacrev` and `torch.func.vmap` to compute the entire Jacobian of your network.
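For example, a sketch with a toy model (the model and shapes here are placeholders, assuming a PyTorch version where `torch.func` is available):

```python
import torch
from torch.func import jacrev, vmap

model = torch.nn.Linear(2, 3)  # toy network: 2 inputs -> 3 outputs

def f(x):
    # x is a single (unbatched) sample of shape [2]
    return model(x)

x = torch.randn(8, 2)         # batch of 8 samples
jac = vmap(jacrev(f))(x)      # per-sample Jacobian, shape [8, 3, 2]
print(jac.shape)
```

`jacrev(f)` builds the Jacobian of `f` for one sample, and `vmap` maps it over the batch dimension, so nothing gets summed across outputs.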

Also, please share a minimal reproducible example to help explain your problem.

Hi,

thanks, you are right, I should have posted some code.

Below is the code; my loss function contains high-order derivatives of the outputs with respect to the inputs x and y:

```
def gradient(y, x, grad_outputs=None):
    if grad_outputs is None:
        grad_outputs = torch.ones_like(y)
    grad = torch.autograd.grad(y, [x], grad_outputs=grad_outputs, create_graph=True)[0]
    return grad

def compute_derivatives(x, y, u):
    dudx = gradient(u, x)
    dudy = gradient(u, y)
    dudxx = gradient(dudx, x)
    dudyy = gradient(dudy, y)
    dudxxx = gradient(dudxx, x)
    dudxxy = gradient(dudxx, y)
    dudyyy = gradient(dudyy, y)
    dudxxxx = gradient(dudxxx, x)
    dudxxyy = gradient(dudxxy, y)
    dudyyyy = gradient(dudyyy, y)
    return dudxx, dudyy, dudxxxx, dudyyyy, dudxxyy
```

Also, I already tried `vmap` with a simplified version of the code, but it gives back the following error:

```
import torch
from functorch import vmap

x = torch.tensor([1.0, 2.0], requires_grad=True)
out = torch.stack([x * 2, x * 3], dim=0)
print('x:', x)
print('out:', out)

def single_gradient(out_row, x):
    grad_outputs = torch.ones_like(out_row)
    return torch.autograd.grad(out_row, [x], grad_outputs=grad_outputs, create_graph=True, retain_graph=True)[0]

batched_grad = vmap(single_gradient, (0, None))(out, x)
print('Batched Grads:', batched_grad)
```

“RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn”

What do you think about this?

`jacrev` takes a function as its input, but what I have is the network output, a tensor of shape [batch_size, N].

Given that I do not have a function, can I still use the Jacobian?

Using the `torch.func` namespace requires you to re-write how you compute the gradients entirely; you can’t mix and match between `torch.autograd` and `torch.func`, especially when using `torch.func.vmap` (at least to my knowledge).
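That said, if you want to stay inside `torch.autograd`, one option worth trying (assuming a PyTorch version that supports it) is `is_grads_batched=True`, which applies one `grad_outputs` vector per output column instead of summing over them. A minimal sketch with toy tensors:

```python
import torch

x = torch.randn(4, 1, requires_grad=True)   # [batch_size, 1]
u = torch.cat([2 * x, 3 * x], dim=1)        # [batch_size, n_output] = [4, 2]

# one one-hot grad_outputs per output column, each of shape u.shape
v = torch.eye(2)[:, None, :].expand(2, 4, 2)   # [n_output, batch_size, n_output]
dudx = torch.autograd.grad(u, x, grad_outputs=v,
                           is_grads_batched=True, create_graph=True)[0]
print(dudx.shape)  # torch.Size([2, 4, 1]) -- one du_i/dx per output column
```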

I have some previous examples on the forums of how to compute gradients with a ‘functional’ approach here: