Reduce output dimensions of batch Hessian without looping

Hi all,

I used the method here to compute the batch Hessian of the output of a neural network f with respect to its input at given batch points x, where x has shape [B, N], with B being the batch size and N being the input dimension of f.

The code produces a tensor Output of shape [B, N, B, N], where Output[i, :, i, :] is the Hessian of f at x[i, :]. I then reduce this tensor to the desired shape [B, N, N] by looping over range(B) and setting Final_output[i, :, :] = Output[i, :, i, :].
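To make the setup concrete, here is a minimal sketch of the loop-based reduction. It uses NumPy and a stand-in tensor rather than my actual network output, so the function name `reduce_with_loop` and the toy function f(x) = sum(x**3) are only illustrative assumptions.

```python
import numpy as np


def reduce_with_loop(output):
    """Copy the B diagonal blocks of a [B, N, B, N] array into a [B, N, N] array."""
    B, N = output.shape[0], output.shape[1]
    final_output = np.empty((B, N, N), dtype=output.dtype)
    for i in range(B):
        # Block i is the Hessian of f at x[i, :]
        final_output[i, :, :] = output[i, :, i, :]
    return final_output


# Stand-in data (assumption): for f(x) = sum(x**3), the Hessian at x[i] is diag(6 * x[i]).
B, N = 3, 2
x = np.arange(1.0, B * N + 1).reshape(B, N)
output = np.zeros((B, N, B, N))
for i in range(B):
    output[i, :, i, :] = np.diag(6.0 * x[i])

final_output = reduce_with_loop(output)
print(final_output.shape)
```

The loop over B is the part I would like to eliminate.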

My question is whether it is possible to perform this reduction without using a loop.