Hi,
I am trying to get the gradients of a function with multiple vector outputs. Let x, y, z be the inputs and u, v, w the outputs, all vectors of length N. If I use torch.autograd.grad, my code looks like this:
import torch

def func(x):
    ...  # do something fancy to x

def gradients(outpt, inpt):
    # Vector-Jacobian product with an all-ones vector over the outputs
    return torch.autograd.grad(
        outpt, inpt, grad_outputs=torch.ones_like(outpt), create_graph=True
    )[0]

inpt = torch.stack((x, y, z), 1)  # shape (N, 3)
outpt = func(inpt)
grad = gradients(outpt, inpt)
Then grad is an N*3 matrix, and grad[:, 0] is actually $u_x+v_x+w_x$ (the sum of the partial derivatives of u, v and w with respect to x), because the all-ones grad_outputs sums over every output. But I want to get $u_x, u_y, u_z$ separately. I have tried torch.autograd.functional.jacobian as follows:
def func(x):
    ...  # do something fancy to x

def jacobian(f, x):
    # Full Jacobian has shape (N, 3, N, 3); summing over dim 0 leaves (3, N, 3)
    return torch.sum(
        torch.autograd.functional.jacobian(f, x, create_graph=True), dim=0
    )

inpt = torch.stack((x, y, z), 1)
grad = jacobian(func, inpt)
This gives me a grad that contains the separate gradients of u, v, w with respect to x, y, z. Its shape is 3*N*3, where $u_y$ (the derivative of u with respect to y) is grad[0, :, 1], etc. But it computes the derivative of every single output element with respect to every single input element, which uses far more GPU memory than necessary.
So, is there an elegant way to get the gradients I want with torch.autograd? Thank you so much!
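
One possible approach, given only as a minimal sketch: call torch.autograd.grad once per output column instead of once for all outputs. This assumes func works row by row, so that row i of the output depends only on row i of the input; if rows are coupled, each entry would also sum contributions across rows. The toy func and the helper name per_output_gradients below are hypothetical, made up for illustration.

```python
import torch

def func(inpt):
    # Hypothetical stand-in for the real function: maps (N, 3) -> (N, 3) row by row.
    x, y, z = inpt[:, 0], inpt[:, 1], inpt[:, 2]
    u = x * y
    v = y * z
    w = x * z + x ** 2
    return torch.stack((u, v, w), 1)

def per_output_gradients(outpt, inpt):
    # One backward pass per output column. Returns shape (3, N, 3), where
    # result[k, :, j] is d(output k)/d(input j) at each sample.
    grads = []
    for k in range(outpt.shape[1]):
        col = outpt[:, k]
        (g,) = torch.autograd.grad(
            col, inpt, grad_outputs=torch.ones_like(col), create_graph=True
        )
        grads.append(g)
    return torch.stack(grads, 0)

N = 5
x = torch.rand(N, requires_grad=True)
y = torch.rand(N, requires_grad=True)
z = torch.rand(N, requires_grad=True)

inpt = torch.stack((x, y, z), 1)
outpt = func(inpt)
grad = per_output_gradients(outpt, inpt)  # same layout as the summed Jacobian
u_y = grad[0, :, 1]                       # for this toy func, u = x * y, so u_y = x
```

Each pass only materializes an N*3 gradient, so the full N*3*N*3 Jacobian is never built. The trade-off is three backward passes instead of one, which is usually acceptable when the number of outputs is small.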