Assume we have y as a function of x, y = f(x), and z as a function of y, z = g(y).

How can I compute the gradient w.r.t. y first (dz/dy, and use it for something else), and then compute the gradient w.r.t. x (dy/dx)?

Any ideas would be helpful, thanks!

Hello,

is this approximately what you need: let’s say you have

```
import torch
from torch.autograd import Variable

x = Variable(torch.randn(4), requires_grad=True)
y = f(x)
y2 = Variable(y.data, requires_grad=True)  # wrap y.data in a new Variable to cut the graph between f and g
z = g(y2)
```

(there also is Variable.detach, but not now)

Then you can do (assuming z is a scalar)

```
z.backward()         # computes dz/dy2 and stores it in y2.grad
y.backward(y2.grad)  # computes dy/dx * y2.grad (chain rule) and stores it in x.grad
print(x.grad)
```

Note that `.backward` evaluates the derivative at the values from the last forward computation.
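For reference, here is a self-contained sketch of the same two-stage backward pass with the newer tensor API (`detach().requires_grad_()` instead of `Variable(y.data, ...)`); the concrete choices f(x) = x² and g(y) = Σy are just for illustration, so the result can be checked against 2·x:

```python
import torch

x = torch.randn(4, requires_grad=True)
y = x ** 2                         # f(x)
y2 = y.detach().requires_grad_()   # split the graph: y2 shares y's values but is a new leaf
z = y2.sum()                       # g(y)

z.backward()                       # dz/dy2, stored in y2.grad (all ones here)
y.backward(y2.grad)                # dy/dx * dz/dy, stored in x.grad

assert torch.allclose(x.grad, 2 * x)  # d(x²)/dx * 1 = 2x
```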

(I hope this is correct, I don’t have access to my pytorch right now.)

Best regards

Thomas


Thanks!

This seems like a proper way to do it. I’ll try it.