Second order derivatives?

How can I compute gradients when the loss is itself defined in terms of gradients?
In TensorFlow, for example, I could call tf.gradients() twice, feeding the result of the first call into the second.
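Roughly, what I mean is something like this (a TF 1.x graph-mode sketch; w, loss, and x are just placeholder names):

```python
import tensorflow as tf

# TF 1.x-style sketch of the nested-gradients pattern: call tf.gradients()
# once for the loss, then again on a scalar built from those gradients.
w = tf.Variable([1.0, 2.0, 3.0])
loss = tf.reduce_sum(w * w)

grads = tf.gradients(loss, [w])                              # first call: d(loss)/d(w)
flat_grad = tf.concat([tf.reshape(g, [-1]) for g in grads], axis=0)
x = tf.constant([0.1, 0.2, 0.3])
gvp = tf.reduce_sum(flat_grad * x)                           # scalar: flat_grad . x
second = tf.gradients(gvp, [w])                              # second call: grads of that scalar
```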

My specific problem is this: I am implementing TRPO, and I have:
flat_grad <- the flattened gradients of a loss w.r.t. the network parameters
x <- a tensor with the same shape as flat_grad

and I need the gradients of (flat_grad * x) w.r.t. the network parameters.

In the process of flattening the gradients, I had to convert everything into a numpy array, which broke the backprop chain. How can I solve this problem?

You can flatten the gradients with torch.cat([g.view(-1) for g in grads], 0). torch.cat is supported by autograd, so this keeps the backprop chain intact without going through NumPy.
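As a minimal sketch of that flattening (the model and loss below are just illustrative, written in current PyTorch style):

```python
import torch
import torch.nn as nn

# Illustrative model and scalar loss.
model = nn.Linear(4, 2)
loss = model(torch.randn(8, 4)).pow(2).mean()
loss.backward()

# Flatten the per-parameter gradients into one vector.
# torch.cat stays inside autograd, so nothing is detached.
grads = [p.grad for p in model.parameters()]
flat_grad = torch.cat([g.view(-1) for g in grads], 0)
```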

Regarding taking the grad of a grad, we will support it (that's why var.grad is a Variable too), but it's still a work in progress. At this point there's no way to do it, but hopefully it will be implemented in the next month. Sorry!
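For reference, once double backward is available (torch.autograd.grad with create_graph=True in later PyTorch releases), the pattern the question asks for could be sketched like this; the model, loss, and vector x are illustrative placeholders:

```python
import torch
import torch.nn as nn

# Illustrative model and scalar loss.
model = nn.Linear(4, 2)
params = list(model.parameters())
loss = model(torch.randn(8, 4)).pow(2).mean()

# First-order gradients, keeping the graph so we can differentiate again.
grads = torch.autograd.grad(loss, params, create_graph=True)
flat_grad = torch.cat([g.view(-1) for g in grads], 0)

# Gradients of the scalar (flat_grad . x) w.r.t. the parameters.
x = torch.randn_like(flat_grad)
hvp = torch.autograd.grad(flat_grad @ x, params)
```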