How to compute derivatives of multiple outputs in parallel

I’m working on a Physics Informed Neural Network that has two inputs and N outputs. The loss function contains high-order derivatives of the outputs with respect to the inputs x and y. Is it possible to compute the derivatives in parallel without a for loop?

def gradient(y, x, grad_outputs=None):
    # d(y)/d(x); create_graph=True keeps the graph so higher-order derivatives can be taken.
    if grad_outputs is None:
        grad_outputs = torch.ones_like(y)
    grad = torch.autograd.grad(y, [x], grad_outputs=grad_outputs, create_graph=True)[0]
    return grad


def compute_derivatives(x, y, u):
    # First-order derivatives
    dudx = gradient(u, x)
    dudy = gradient(u, y)

    # Second-order derivatives
    dudxx = gradient(dudx, x)
    dudyy = gradient(dudy, y)

    # Third-order derivatives
    dudxxx = gradient(dudxx, x)
    dudxxy = gradient(dudxx, y)
    dudyyy = gradient(dudyy, y)

    # Fourth-order derivatives
    dudxxxx = gradient(dudxxx, x)
    dudxxyy = gradient(dudxxy, y)
    dudyyyy = gradient(dudyyy, y)

    return dudxx, dudyy, dudxxxx, dudyyyy, dudxxyy

The code above works fine for a single output. For N outputs, the shape of u is [batch_size, N], and I need to compute the derivatives for each column of u in parallel so that the shape of each derivative (e.g. dudx, dudy, …) matches the shape of u. What is the most efficient way to do this? Thanks in advance.
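
For reference, the straightforward per-column loop I am trying to avoid would look roughly like this (just a sketch, assuming x and y each have shape [batch_size, 1] and u has shape [batch_size, N]):

def compute_dudx_loop(x, u):
    # One backward pass per output column; this is the loop I would like to remove.
    cols = [gradient(u[:, i:i + 1], x) for i in range(u.shape[1])]
    return torch.cat(cols, dim=1)  # shape [batch_size, N], same as u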

Would torch.func.vmap work for your case? (torch.func.vmap — PyTorch 2.3 documentation)
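
Something along these lines might work. This is only a sketch, assuming the network maps a single point (x, y) to N outputs; the model below is just a placeholder:

import torch
from torch.func import vmap, jacrev

# Placeholder model: one (x, y) point in, N = 3 outputs out.
model = torch.nn.Sequential(torch.nn.Linear(2, 64), torch.nn.Tanh(), torch.nn.Linear(64, 3))

def f(point):
    # point has shape [2]; the returned tensor has shape [N].
    return model(point)

points = torch.rand(128, 2)  # batch of (x, y) points

# Per-point Jacobian of all N outputs w.r.t. (x, y): shape [128, N, 2].
first = vmap(jacrev(f))(points)

# Per-point second derivatives (Hessian of each output): shape [128, N, 2, 2].
second = vmap(jacrev(jacrev(f)))(points)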

Thank you, I am going to have another look. I already tried to implement it, but maybe I missed something.

Hi, I tested your suggestion about torch.func.vmap. I started with a simplified example (below) to see what each output looks like:

import torch
from torch.func import vmap

x = torch.tensor([1.0, 2.0], requires_grad=True)
out = torch.stack([x * 2, x * 3], dim=0)  # shape [2, 2], one row per output

print('x:', x)
print('out:', out)


def single_gradient(out_row, x):
    # Gradient of one output row with respect to x.
    grad_outputs = torch.ones_like(out_row)
    return torch.autograd.grad(out_row, [x], grad_outputs=grad_outputs, create_graph=True, retain_graph=True)[0]


# Map single_gradient over the rows of `out` (dim 0), keeping x un-batched.
batched_grad = vmap(single_gradient, (0, None))(out, x)

print('Batched Grads:', batched_grad)

Above I am creating a vector of two numbers and a 2x2 matrix.
In theory, this should differentiate each row of the matrix with respect to x and give back another 2x2 matrix.
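
For comparison, computing the rows one at a time (without vmap) should give:

# Row-by-row reference (no vmap): each row of `out` is an elementwise
# multiple of x, so the gradients are the constant factors 2 and 3.
expected = torch.stack([single_gradient(out[i], x) for i in range(out.shape[0])])
print('Expected:', expected)  # [[2., 2.], [3., 3.]]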

But I get this error:
“RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn”

The error occurs when torch.autograd.grad is called inside the vmapped function, so it seems that vmap affects the computational graph. Do you have any ideas?