How to avoid sum from ‘autograd.grad’ output in Physics Informed Neural Network?

Hello,
I’m working on a Physics Informed Neural Network and I need to take the derivatives of the outputs w.r.t. the inputs and use them in the loss function.
The issue is related to the neural network’s multiple outputs. I tried to use ‘autograd.grad’ to calculate the derivatives of the outputs, but it sums all the contributions.
For example, if my output ‘u’ has shape [batch_size, n_output], the derivative ‘dudx’ has shape [batch_size, 1], instead of [batch_size, n_output].
Due to the sum, I can’t use the derivatives in the loss function. I tried using a for loop to calculate each derivative, but then the training takes forever. Do you have any idea how to solve this problem? Thanks in advance.
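Here is a tiny illustration of what I mean (the tensors below are made up, not my real network):

import torch

x = torch.rand(8, 1, requires_grad=True)
u = torch.cat([x ** 2, x ** 3], dim=1)  # u: [8, 2], i.e. n_output = 2

dudx = torch.autograd.grad(u, x, grad_outputs=torch.ones_like(u))[0]
print(dudx.shape)  # torch.Size([8, 1]) -- the two output columns got summed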

You could have a look at using torch.func.jacrev and torch.func.vmap to compute the entire Jacobian of your network.
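For example, a rough sketch of that approach could look like this (the network and shapes are invented purely for illustration, and it assumes a recent PyTorch where torch.func is available):

import torch
from torch.func import jacrev, vmap

# hypothetical network just for illustration: 2 inputs (x, y) -> 3 outputs
model = torch.nn.Sequential(
    torch.nn.Linear(2, 32),
    torch.nn.Tanh(),
    torch.nn.Linear(32, 3),
)

def f(xy_single):
    # xy_single: [2] -> [3]; the Jacobian is taken w.r.t. this input
    return model(xy_single)

xy = torch.rand(16, 2)  # [batch_size, 2]

# per-sample Jacobian of the outputs w.r.t. the inputs: [batch_size, 3, 2]
jac = vmap(jacrev(f))(xy)
print(jac.shape)  # torch.Size([16, 3, 2])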

Also, please share a minimal reproducible example to help explain your problem.

Hi,
thanks, you are right, I should have posted some code.
Below is the code; my loss function contains high-order derivatives of the outputs with respect to the inputs x and y:

def gradient(y, x, grad_outputs=None):
    # vector-Jacobian product: with grad_outputs = ones, the contributions of
    # all output columns of y are summed, so grad has the shape of x
    if grad_outputs is None:
        grad_outputs = torch.ones_like(y)
    grad = torch.autograd.grad(y, [x], grad_outputs=grad_outputs, create_graph=True)[0]
    return grad


def compute_derivatives(x, y, u):
    # first derivatives
    dudx = gradient(u, x)
    dudy = gradient(u, y)

    # second derivatives
    dudxx = gradient(dudx, x)
    dudyy = gradient(dudy, y)

    # third derivatives
    dudxxx = gradient(dudxx, x)
    dudxxy = gradient(dudxx, y)
    dudyyy = gradient(dudyy, y)

    # fourth derivatives
    dudxxxx = gradient(dudxxx, x)
    dudxxyy = gradient(dudxxy, y)
    dudyyyy = gradient(dudyyy, y)

    return dudxx, dudyy, dudxxxx, dudyyyy, dudxxyy
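
To give a bit more context, I call these helpers roughly like this (the model here is just a placeholder, not my real network):

import torch

# placeholder network, 2 inputs (x, y) -> 3 outputs, only for illustration
model = torch.nn.Sequential(torch.nn.Linear(2, 32), torch.nn.Tanh(), torch.nn.Linear(32, 3))

x = torch.rand(128, 1, requires_grad=True)
y = torch.rand(128, 1, requires_grad=True)
u = model(torch.cat([x, y], dim=1))  # u: [128, 3], i.e. [batch_size, n_output]

dudxx, dudyy, dudxxxx, dudyyyy, dudxxyy = compute_derivatives(x, y, u)
# every returned derivative has shape [128, 1] (the shape of x and y), because
# grad_outputs=torch.ones_like(u) sums the contributions of the n_output columns
print(dudxx.shape)  # torch.Size([128, 1])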

Also, I already tried vmap with a simplified version of the code, but it gives back the following error:

import torch
from functorch import vmap

x = torch.tensor([1.0, 2.0], requires_grad=True)
out = torch.stack([x * 2, x * 3], dim=0)

print('x:', x)
print('out:', out)


def single_gradient(out_row, x):
    grad_outputs = torch.ones_like(out_row)
    return torch.autograd.grad(out_row, [x], grad_outputs=grad_outputs, create_graph=True, retain_graph=True)[0]


# this call raises the RuntimeError quoted below: torch.autograd.grad does not
# work on the batched tensors that vmap creates
batched_grad = vmap(single_gradient, (0, None))(out, x)

print('Batched Grads:', batched_grad)

“RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn”
What do you think about this?

The Jacobian approach requires a function as input, but what I have is the network output with shape [batch_size, N]. Given that I do not have a function, can I still use the Jacobian?

Using the torch.func namespace requires you to re-write how you compute the gradients entirely; you can’t mix and match between torch.autograd and torch.func, especially when using torch.func.vmap (at least to my knowledge).
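To illustrate, your toy example could be rewritten in a purely functional style roughly like this (just a sketch, with the coefficients 2 and 3 pulled out so that there is an explicit function to differentiate):

import torch
from torch.func import grad, vmap

x = torch.tensor([1.0, 2.0])
coeffs = torch.tensor([2.0, 3.0])  # out = torch.stack([2 * x, 3 * x]) in your example

def row_sum(coeff, x):
    # summing the row plays the role of grad_outputs = torch.ones_like(out_row)
    return (coeff * x).sum()

# gradient of each row w.r.t. x, without calling torch.autograd.grad inside vmap
batched_grad = vmap(grad(row_sum, argnums=1), in_dims=(0, None))(coeffs, x)
print(batched_grad)  # tensor([[2., 2.], [3., 3.]])

The key difference is that the quantity being differentiated is expressed as a function (row_sum here) rather than as an already-computed tensor, which is what torch.func.grad and torch.func.jacrev expect.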

I have some previous examples on the forums of how to compute gradients with a ‘functional’ approach here: