How to compute multiple partial derivatives of a loss function w.r.t. different parameters?

phantom90 · June 4, 2021, 3:02am

Hi there,

Suppose we have a loss function loss = f(x, y), where x and y are two parameters. We want to perform the following two operations:

Are there any hints on how to implement this?

soulitzer · June 4, 2021, 3:14am

Hi,

If you are trying to compute the second order partial derivative, maybe something like:

import torch

x = torch.tensor(2., requires_grad=True)
y = torch.tensor(3., requires_grad=True)

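def f(a, b):
    return a**2 * b

def df_da(a, b):
    # create_graph=True records the graph of the gradient itself,
    # so the result (2*a*b) can be differentiated a second time
    return torch.autograd.grad(f(a, b), (a,), create_graph=True)[0]

# Differentiate df/dx with respect to y
print(torch.autograd.grad(df_da(x, y), (y,))) # d^2f/dydx = 2x = 4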
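
As a cross-check on the nested torch.autograd.grad calls above, the same mixed partial can be read off the full Hessian. This is only a sketch: it assumes torch.autograd.functional.hessian is available in your PyTorch version and reuses the toy function f from above.

```python
import torch
from torch.autograd.functional import hessian

def f(a, b):
    return a**2 * b

# Plain tensors are enough here; hessian() sets up gradient tracking internally
x = torch.tensor(2.)
y = torch.tensor(3.)

# With a tuple of inputs, H[i][j] holds d^2 f / d(input_i) d(input_j)
H = hessian(f, (x, y))

d2f_dxdy = H[0][1]  # analytically 2*x
d2f_dydx = H[1][0]  # analytically 2*x

print(d2f_dxdy, d2f_dydx)
```

This computes all second derivatives at once, so for a single mixed partial the nested grad version is likely cheaper.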

phantom90 · June 4, 2021, 3:26am

Thank you so much! Just one more question:

What if we want to use loss.backward() and optimizer.step() to update the parameter x? How should the expression torch.autograd.grad(df_da(x, y), (y,)) be rewritten?

Formally, we want to do:

I know the question is a bit vague. If you need any clarification, please let me know.

Thanks again!

soulitzer · June 4, 2021, 4:17am

No problem!

The direct analog is torch.autograd.backward(df_da(x, y), inputs=(y,)). Leaving out the inputs argument should also be OK, though the gradients would then also be accumulated for the other input (x), which is a bit of extra computation. Then, assuming your optimizer is configured to update y, optimizer.step() should just work, i.e., for plain SGD it does something like y <- y - lr * y.grad
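
Putting the pieces together, a minimal sketch of one update step might look like the following. The choice of torch.optim.SGD and the learning rate lr = 0.1 are assumptions made for illustration, not something from the question; f and df_da are the toy functions from earlier in the thread.

```python
import torch

x = torch.tensor(2., requires_grad=True)
y = torch.tensor(3., requires_grad=True)

def f(a, b):
    return a**2 * b

def df_da(a, b):
    # Keep the graph so the gradient can itself be differentiated
    return torch.autograd.grad(f(a, b), (a,), create_graph=True)[0]

lr = 0.1  # assumed learning rate, for illustration only
optimizer = torch.optim.SGD([y], lr=lr)  # assumed optimizer; it only updates y

optimizer.zero_grad()
# Accumulate d(df/dx)/dy into y.grad only; x.grad is left untouched
torch.autograd.backward(df_da(x, y), inputs=(y,))
optimizer.step()  # plain SGD: y <- y - lr * y.grad

print(y.grad, x.grad, y)
```

Here y.grad is 2x = 4, so the step moves y from 3 to 3 - 0.1 * 4 = 2.6, while x.grad stays None because x was not listed in inputs.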