I’d advise using torch.func.hessian,
as it’ll be significantly more efficient than a torch.autograd
approach. I have an example on the forums here: Efficient computation of Hessian with respect to network weights using autograd.grad and symmetry of Hessian matrix - #8 by AlphaBetaGamma96
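
For reference, here is a minimal sketch of what that could look like for the Hessian of a loss with respect to network weights. The toy model, the random data, and the names `loss_fn`, `params`, and `hess` are made up for illustration and are not taken from the linked thread; the only library calls used are `torch.func.hessian` and `torch.func.functional_call`.

```python
import torch
from torch.func import functional_call, hessian

torch.manual_seed(0)

# Hypothetical toy model and data, purely for illustration.
model = torch.nn.Sequential(
    torch.nn.Linear(3, 4),
    torch.nn.Tanh(),
    torch.nn.Linear(4, 1),
)
x = torch.randn(8, 3)
y = torch.randn(8, 1)

# Detached copies of the parameters, keyed by name ("0.weight", "0.bias", ...).
params = {name: p.detach() for name, p in model.named_parameters()}

def loss_fn(params):
    # Run the model statelessly with the supplied parameters.
    pred = functional_call(model, params, (x,))
    return ((pred - y) ** 2).mean()

# Differentiates with respect to the first argument (the params dict).
# The result is a nested dict: hess[a][b] has shape
# params[a].shape + params[b].shape.
hess = hessian(loss_fn)(params)

print(hess["0.weight"]["0.weight"].shape)
```

Each block `hess[a][b]` can be reshaped to a 2D matrix with `params[a].numel()` rows and `params[b].numel()` columns if you need the full Hessian as a single matrix.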