Implementing calculation of the Laplacian

Hi All,

I have a quick question about how to efficiently implement the Laplacian of the output of a network. I’ve managed to implement a method for calculating it; however, I’m fairly sure the way I’m doing it is inefficient.

What I mean by the Laplacian of the output of the network is: let’s say I have a simple feed-forward network, y = model(x), and I wish to calculate ∑_i ∂²y/∂x_i² for all input samples.

My current code for calculating the Laplacian for an N-dimensional input x is

N = x.shape[1] # dimensionality of each input sample

y = model(x) # where model is an R^N to R^1 function

laplacian = torch.zeros(x.shape[0]) # array to store the Laplacian of each sample

for i, xi in enumerate(x):
    hess = torch.autograd.functional.hessian(model, xi.unsqueeze(0), create_graph=True)
    laplacian[i] = torch.diagonal(hess.view(N, N), offset=0).sum()

I’ve noticed that for large numbers of input samples (which I need) the memory usage increases drastically. I assume this is because I create a graph for each call of torch.autograd.functional.hessian. I need create_graph=True because I differentiate the Laplacian with respect to the network parameters in my loss function.

My question is: would it be possible to reduce the memory usage of the Hessian calculation, or perhaps re-use the same graph for different inputs (if that’s the correct way of phrasing it)?

Many thanks in advance!

The memory usage from the graph comes from the intermediate results, so re-using the graph wouldn’t help.

I’m afraid that extracting the diagonal of the Hessian cannot be done more efficiently in general.
You could reduce the memory usage with a custom function that drops the off-diagonal elements earlier, but that won’t be a huge change; a sketch of that idea is below.
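
To make that concrete, here’s a minimal, untested sketch of that idea: it loops over the input dimensions with two autograd.grad calls and only ever stores the diagonal second derivatives. It assumes the model maps each row of the batch to a single output independently (no BatchNorm-style coupling across samples) and that x is a plain leaf tensor:

import torch

def laplacian_diag_only(model, x):
    # keep only d2y/dx_i2; the off-diagonal Hessian entries are never stored
    x = x.requires_grad_(True)  # assumes x is a plain (leaf) input tensor
    y = model(x)                                                   # shape (batch, 1)
    dy_dx = torch.autograd.grad(y.sum(), x, create_graph=True)[0]  # shape (batch, N)
    laplacian = torch.zeros(x.shape[0], device=x.device)
    for i in range(x.shape[1]):
        # second derivative w.r.t. x_i only; create_graph=True so the result
        # can still be differentiated w.r.t. the network parameters
        d2y_dxi2 = torch.autograd.grad(dy_dx[:, i].sum(), x, create_graph=True)[0][:, i]
        laplacian = laplacian + d2y_dxi2
    return laplacian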

You might also want to check out packages like this one https://backpack.pt/ that provide approximate methods for getting the diagonal elements. If an approximate value is acceptable for your use case, it will be the fastest you can do!
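
As a related aside (this is not what BackPACK does, just another standard approximation): since the Laplacian is the trace of the Hessian, you could also estimate it with a Hutchinson-style trace estimator built from Hessian-vector products, which never forms the diagonal at all. A rough sketch, under the same batching assumptions as above:

import torch

def hutchinson_laplacian(model, x, num_samples=10):
    # estimate trace(H) = E[v^T H v] with random +/-1 probe vectors v,
    # using Hessian-vector products instead of the full Hessian
    x = x.requires_grad_(True)
    y = model(x)
    dy_dx = torch.autograd.grad(y.sum(), x, create_graph=True)[0]  # shape (batch, N)
    estimate = torch.zeros(x.shape[0], device=x.device)
    for _ in range(num_samples):
        v = torch.randint(0, 2, x.shape, device=x.device, dtype=x.dtype) * 2 - 1
        Hv = torch.autograd.grad((dy_dx * v).sum(), x, create_graph=True)[0]
        estimate = estimate + (v * Hv).sum(dim=1)
    return estimate / num_samples

More probe vectors give a lower-variance estimate at the cost of extra backward passes.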


Thanks for the quick response! On implementing a custom function, I did have a look at creating custom autograd Functions, which have forward and backward methods. I was wondering whether it is at all possible to implement a “backward of backward” method in an autograd Function?

What I mean by “backward of backward” is an analytical expression for the second-order gradient of the loss function with respect to the output of the final layer. Because the final layer of my network is a determinant function, explicitly writing out those derivatives and calling them directly would be far more efficient than calling autograd and letting it work them out itself.

The engine expects to only compute gradients. So you do have to do two backwards right now.

You can write these formulas for sure, but it might be tricky to re-use the current engine for this :confused:
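
For what it’s worth, one pattern that goes part of the way is a custom autograd Function whose hand-written backward is composed of differentiable torch ops; when the second backward runs (with create_graph=True), autograd differentiates through that explicit formula rather than the original forward graph. A rough, untested sketch for a determinant layer, using the standard identity d det(A)/dA = det(A) A^{-T}:

import torch

class Det(torch.autograd.Function):
    # sketch only: explicit first-order formula for det, written with
    # differentiable ops so a second backward can pass through it

    @staticmethod
    def forward(ctx, A):
        det = torch.linalg.det(A)
        ctx.save_for_backward(A, det)
        return det

    @staticmethod
    def backward(ctx, grad_out):
        A, det = ctx.saved_tensors
        # d det(A)/dA = det(A) * A^{-T}
        return grad_out[..., None, None] * det[..., None, None] * torch.linalg.inv(A).transpose(-2, -1)

Whether this actually beats plain torch.linalg.det (which already supports double backward) would need profiling; it’s mainly meant to show where an analytical expression could be slotted in.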


Hi, AlphaBetaGamma96,

I am having the same issue here. Memory just blows up when computing the Laplacian.
Did you manage to find an efficient implementation for the Laplacian?
If so, could you please share it?

Best

Hi @rfmiotto,

You can do this quite efficiently within the torch.func namespace, which requires torch 2.1+. Here’s an example below for the Laplacian with respect to an input x,

import torch
from torch.func import hessian, vmap, functional_call

net = Model(*args, **kwargs) #our model

x = torch.randn(batch_size, num_inputs) #inputs
params = dict(net.named_parameters()) #params (need to be an input for torch.func to work)

#functionalized version (params are now an input to the model)
def fcall(params, x):
  return functional_call(net, params, x)

def compute_laplacian(params, x):
  _hessian_wrt_x = hessian(fcall, argnums=1)(params, x) #forward-over-reverse hessian calc.
  #use relative dims for vmap (the function doesn't see the batch dim of the input),
  #then sum the diagonal entries to get the Laplacian
  _laplacian_wrt_x = _hessian_wrt_x.diagonal(0, -2, -1).sum(-1)
  return _laplacian_wrt_x

#We define a Laplacian func for a single sample, then vectorize over the batch.
laplacian = vmap(compute_laplacian, in_dims=(None,0))(params, x) 
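
If you also need gradients of the Laplacian with respect to the network parameters (as in the original question), the same functional style composes. A hypothetical loss on top of the snippet above could look like this (loss_fn here is just an illustrative example):

from torch.func import grad

def loss_fn(params, x):
  lap = vmap(compute_laplacian, in_dims=(None, 0))(params, x)
  return lap.pow(2).mean() #example loss penalising the Laplacian

param_grads = grad(loss_fn, argnums=0)(params, x) #dict with the same keys as params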