Hi,

I am trying to compute the gradients of a function with multiple vector outputs. Let `x, y, z` be the inputs and `u, v, w` the outputs, all vectors of length `N`. If I use `torch.autograd.grad`, my code looks like this:

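```
import torch

def func(x):
    ...  # do something fancy to x

def gradients(outpt, inpt):
    # Vector-Jacobian product with a tensor of ones as the vector
    return torch.autograd.grad(
        outpt, inpt, grad_outputs=torch.ones_like(outpt), create_graph=True
    )[0]

# x, y, z: length-N tensors that require grad
inpt = torch.stack((x, y, z), 1)  # shape (N, 3)
outpt = func(inpt)
grad = gradients(outpt, inpt)
```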

Then `grad` is an `N*3` matrix, and `grad[:, 0]` is actually $u_x+v_x+w_x$ (the sum of the partial derivatives of u, v and w with respect to x). But I want to get $u_x, u_y, u_z$ separately. I have tried `torch.autograd.functional.jacobian` as follows:

```
import torch

def func(x):
    ...  # do something fancy to x

def jacobian(f, x):
    # The full Jacobian has shape (N, 3, N, 3); sum over the output-row axis
    return torch.sum(
        torch.autograd.functional.jacobian(f, x, create_graph=True), dim=0
    )

inpt = torch.stack((x, y, z), 1)  # shape (N, 3)
grad = jacobian(func, inpt)
```

This gives me a `grad` that contains the separate gradients of `u, v, w` with respect to `x, y, z`. Its size is `3*N*3`, where $u_y$ (the derivative of u with respect to y) is `grad[0, :, 1]`, etc. But it computes the derivative of every single output element with respect to every single input element, which takes an unnecessarily large amount of GPU memory.
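To make the two behaviours concrete, here is a minimal sketch with a toy stand-in for `func`. The name `toy_func` and its formulas are made up for illustration, and the sketch assumes each output row depends only on the matching input row. Under that assumption, most of the full Jacobian is zero, which is where the memory goes to waste.

```python
import torch

def toy_func(inp):
    # Hypothetical pointwise function: output row i depends only on input row i.
    x, y, z = inp[:, 0], inp[:, 1], inp[:, 2]
    u = x * y
    v = y * z
    w = x + z ** 2
    return torch.stack((u, v, w), 1)  # shape (N, 3)

N = 4
inpt = torch.arange(1.0, 3 * N + 1).reshape(N, 3).requires_grad_(True)
outpt = toy_func(inpt)

# Route 1: a single backward pass with grad_outputs of ones.
# Column j holds the derivatives of u + v + w with respect to input j.
summed = torch.autograd.grad(outpt, inpt, grad_outputs=torch.ones_like(outpt))[0]

# Route 2: the full Jacobian, shape (N, 3, N, 3), summed over the output-row axis.
full = torch.autograd.functional.jacobian(toy_func, inpt)
separate = full.sum(dim=0)  # shape (3, N, 3); separate[k, :, j] = d(output k)/d(input j)

# Summing route 2 over the three outputs recovers route 1.
print(torch.allclose(summed, separate.sum(dim=0)))

# For a pointwise function at most 9 * N of the (3 * N) ** 2 Jacobian entries are nonzero.
print(int((full != 0).sum()), "nonzero entries out of", full.numel())
```

With this toy function, `separate[0, :, 1]` equals the `x` column of `inpt`, since $u_y = x$ when $u = xy$.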

So, is there an elegant way to get the gradients I want with `torch.autograd`? Thank you so much!
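For reference, one workaround that avoids the full Jacobian is to call `torch.autograd.grad` once per output column. The sketch below is only an illustration under the assumption that each output row depends only on the matching input row; `toy_func` and `per_output_gradients` are hypothetical names, not part of PyTorch.

```python
import torch

def toy_func(inp):
    # Hypothetical stand-in for func; output row i depends only on input row i.
    x, y, z = inp[:, 0], inp[:, 1], inp[:, 2]
    return torch.stack((x * y, y * z, x + z ** 2), 1)  # columns are u, v, w

def per_output_gradients(outpt, inpt):
    # One vector-Jacobian product per output column: 3 backward passes,
    # each returning an (N, 3) tensor, instead of an (N, 3, N, 3) Jacobian.
    grads = []
    for k in range(outpt.shape[1]):
        col = outpt[:, k]
        g = torch.autograd.grad(
            col, inpt, grad_outputs=torch.ones_like(col), create_graph=True
        )[0]
        grads.append(g)
    return torch.stack(grads, 0)  # shape (3, N, 3); [0, :, 1] is u_y

N = 5
inpt = torch.rand(N, 3, requires_grad=True)
outpt = toy_func(inpt)
grad = per_output_gradients(outpt, inpt)
print(grad.shape)
```

This keeps the `3*N*3` layout of the Jacobian-based version, so `grad[0, :, 1]` is still $u_y$. Because `create_graph=True` keeps the graph alive, the repeated calls on the same `outpt` work, and the result can be differentiated again.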