Is this the correct way?


Suppose we have an expression z = x^2 * y + y, where x and y are both Tensors.

We want to calculate d(dz/dx)/dy, where d is the partial derivative operator.

I do as follows:

import torch

x = torch.tensor([3.0], requires_grad=True)
y = torch.tensor([17.0], requires_grad=True)


loss = x**2 * y + y
loss.backward(create_graph=True, retain_graph=True)

xx = x.grad

result = torch.autograd.grad(xx, y)

This piece of code gives the correct result. However, I wonder whether this is actually the right way to compute it, or whether I just got the correct answer by coincidence.


This is correct, provided x and y are created with requires_grad=True (otherwise x.grad stays None and the second call fails). The x.grad tensor holds dz/dx. Setting create_graph=True builds a graph of the backward pass itself, so a grad_fn is attached to x.grad; it also implies retain_graph=True, so passing both is redundant. Differentiating x.grad with respect to y is therefore the correct way to compute d(dz/dx)/dy.
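As a sanity check, here is a minimal runnable sketch of the same computation, with the result compared against the analytic answer (dz/dx = 2xy, so d(dz/dx)/dy = 2x); the variable names are mine, not from the question:

```python
import torch

# z = x^2 * y + y, with x = 3 and y = 17
x = torch.tensor([3.0], requires_grad=True)
y = torch.tensor([17.0], requires_grad=True)

z = x**2 * y + y
z.backward(create_graph=True)  # build the graph of the backward pass too

dz_dx = x.grad                 # dz/dx = 2*x*y, carries a grad_fn
(d2z_dxdy,) = torch.autograd.grad(dz_dx, y)

# Analytically: d(2*x*y)/dy = 2*x = 6
print(d2z_dxdy)  # tensor([6.])
```

Note that recent PyTorch versions warn when you read .grad of a leaf after backward(create_graph=True); an equivalent pattern that avoids touching .grad is dz_dx, = torch.autograd.grad(z, x, create_graph=True), followed by the same second torch.autograd.grad call.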