Hi there,

Suppose we have a loss function: loss = f(x, y), where x and y are two parameters. We want to do the following two operations:

Are there any hints on how to implement this?

Hi,

If you are trying to compute the second-order partial derivative, maybe something like:

```
import torch

x = torch.tensor(2., requires_grad=True)
y = torch.tensor(3., requires_grad=True)

def f(a, b):
    return a**2 * b

def df_da(a, b):
    # First-order partial df/da; create_graph=True keeps the graph
    # so we can differentiate the result again.
    return torch.autograd.grad(f(a, b), (a,), create_graph=True)[0]

# Mixed second-order partial: d^2f/dydx = 2x = 4
print(torch.autograd.grad(df_da(x, y), (y,)))
```


Thank you so much! Just one more question:

What if we want to use `loss.backward()` and `optimizer.step()` to update the parameter x? How should the expression `torch.autograd.grad(df_da(x, y), (y,))` be rewritten?

Formally, we want to do:

I know the question is a bit vague. If you need any clarification, please let me know.

Thanks again!

No problem!

The direct analog is to just use `torch.autograd.backward(df_da(x, y), inputs=(y,))`. Not specifying the `inputs` parameter should also be OK, though that would lead to gradients also being accumulated for the other input (`x`), which is a bit of extra computation. Then, assuming you have your optimizer configured to update `y`, `optimizer.step()` should just work, i.e., it does something like `y <- y - lr * y.grad`.
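Putting that together, here is a minimal sketch of the pattern, reusing the f(a, b) = a²·b example from above; the SGD optimizer and the learning rate of 0.1 are just illustrative choices, not anything the thread prescribes:

```python
import torch

x = torch.tensor(2., requires_grad=True)
y = torch.tensor(3., requires_grad=True)

def f(a, b):
    return a**2 * b

def df_da(a, b):
    # First-order partial df/da, keeping the graph for a second pass.
    return torch.autograd.grad(f(a, b), (a,), create_graph=True)[0]

# Illustrative optimizer, configured to update y only.
optimizer = torch.optim.SGD([y], lr=0.1)

optimizer.zero_grad()
# Instead of autograd.grad, populate y.grad via backward with inputs=(y,).
torch.autograd.backward(df_da(x, y), inputs=(y,))
optimizer.step()  # y <- y - lr * y.grad

print(y.grad)  # d^2f/dydx = 2x = 4
print(y)       # 3 - 0.1 * 4 = 2.6
```

The only difference from the `autograd.grad` version is that `backward` accumulates the result into `y.grad`, which is exactly where the optimizer looks.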
