The problem I am trying to solve has the following setup. Let f(x; theta) and g(y; phi) be two neural networks with scalar outputs, where theta and phi are the parameters I want to learn and x, y are vector-valued inputs. To train the parameters, the loss function I have is the following:

L(theta, phi) = f(nabla_y g(y; phi); theta) - f(x; theta) - y^T nabla_y g(y; phi)
The issue is that the loss L already contains a gradient with respect to the input y, in the form of nabla_y g(y; phi), so I am wondering whether it is possible to differentiate the loss again with respect to both theta and phi. My tentative code is below, but I am not sure whether it will work:
import torch

# Assuming that f_theta and g_phi are predefined neural networks with scalar outputs
# (Variable is deprecated; tensors take requires_grad directly)
x = torch.randn(1, 2, requires_grad=True)
y = torch.randn(1, 2, requires_grad=True)

g_y = g_phi(y)
g_y.backward()
grad_g_y = y.grad  # nabla_y g(y; phi)

f_grad_g_y = f_theta(grad_g_y)
f_x = f_theta(x)
loss = f_grad_g_y - f_x - torch.sum(y * grad_g_y, dim=1)

# Computing the gradients with respect to theta and phi, i.e. the parameters
# of f_theta and g_phi
loss.backward()
print(next(f_theta.parameters()).grad, next(g_phi.parameters()).grad)