feribg
#1
I currently have a model that outputs a single regression target with MSE loss. I can get the derivatives with respect to the inputs like so:

```
x = x.cuda().requires_grad_(True)  # move to device first so x stays a leaf and x.grad gets populated
output = model.eval()(None, x)
output[0].backward()  # backward on a single scalar element
x.grad[0]
```

However, this only works one element at a time. If `x` is a batch, then

```
x = x.cuda().requires_grad_(True)
output = model.eval()(None, x)
output.backward()  # output is now a vector, not a scalar
x.grad[0]
```

fails with a runtime error saying gradients can be implicitly created only for scalar outputs. I have two questions:

- How do I get a batch of targets plus their derivatives with respect to the inputs?
- How do I get higher-order derivatives with respect to the inputs?

Important to note: if I aggregate the loss (e.g. take the mean) then backward works, but I need the per-sample input <> output derivatives, not the derivative of the averaged loss.
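For concreteness, here is a sketch of the kind of thing I am after, with a toy network standing in for my model (the model, shapes, and names are illustrative assumptions, not my actual code). Since each output element depends only on its own input row, passing a vector of ones as `grad_outputs` should give all per-sample gradients in a single backward pass, and `create_graph=True` should allow differentiating again:

```python
import torch

# Toy stand-in for the real model: one regression output per input row.
model = torch.nn.Sequential(
    torch.nn.Linear(3, 8), torch.nn.Tanh(), torch.nn.Linear(8, 1))

x = torch.randn(4, 3, requires_grad=True)  # batch of 4 inputs
output = model(x).squeeze(-1)              # shape (4,)

# Each output[i] depends only on x[i], so grad_outputs=ones yields the
# per-sample input gradients in one backward pass (no cross-sample mixing).
(grads,) = torch.autograd.grad(
    output, x, grad_outputs=torch.ones_like(output), create_graph=True)

# create_graph=True keeps the graph alive, so we can differentiate again
# to get a higher-order derivative.
(grads2,) = torch.autograd.grad(grads.sum(), x)

print(grads.shape, grads2.shape)  # both torch.Size([4, 3])
```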

feribg
#2
Here is what seems to be an example of this in TensorFlow; what would the PyTorch equivalent be?

```
def fwd_gradient(func, x, input_gradients=None, use_gradient_tape=False):
    ...  # body elided in the original post
```
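My current guess at a PyTorch counterpart is `torch.autograd.functional.jvp`, which computes a forward-mode Jacobian-vector product without materializing the full Jacobian. A minimal sketch, where `func`, the shapes, and the direction vector `v` (playing the role of `input_gradients`) are illustrative assumptions:

```python
import torch
from torch.autograd.functional import jvp

def func(x):
    # Toy per-sample scalar function standing in for the model.
    return torch.tanh(x).sum(dim=-1)

x = torch.randn(4, 3)
v = torch.ones_like(x)  # direction vector, analogous to input_gradients

# jvp returns (func(x), J @ v): the output and the directional derivative.
out, directional = jvp(func, (x,), (v,))
print(out.shape, directional.shape)  # torch.Size([4]) torch.Size([4])
```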