# Getting a gradient d(df/dx)/df

Let us define the real-valued functions f and g as follows:

f(x) = y (for instance f(x) = x^2)
g(y) = dy/dx

The objective is to compute dg/dy.

g(y) is computed using torch.autograd.grad. If I then try to compute dg/dy by calling torch.autograd.grad again, it fails with the message “RuntimeError: One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior.”

It seems that y is not part of the computation graph created after calling the first torch.autograd.grad, but I do not understand why. Here is a minimal example to reproduce the problem:

``````python
import torch
import torch.autograd as ag

# Input x (it must require grad, otherwise the first ag.grad call already fails)
x = torch.empty((1, ), dtype=torch.float32).uniform_(-1., 1.).requires_grad_()

# f(x) = y
y = x ** 2.

# g(y) = dy/dx
A = ag.grad(y, x, torch.ones((1, ), dtype=torch.float32), create_graph=True)[0]

# dg / dy - fails
B = ag.grad(A, y, torch.ones((1, ), dtype=torch.float32), create_graph=True)[0]
``````

I assumed that the computation graph would look as follows:

``````
  ________________
/   f         g   \|     dg/dy
x ------> y ------>  A ---------->  B
\_______________________/|
``````

But it seems that the branch connecting y to A does not exist (it is not created by the first call to torch.autograd.grad).

Would anyone have an idea what is going on here and how I could get such a derivative? Thank you!

Hi,

A is computed as `2 * x`. So A only depends on `x`, not `y`.
In particular, you might want to change your notation to avoid mixing symbolic variables and values.
If we write `f: x -> y`, then we can evaluate it at a given value `x0` to get `y0 = f(x0)`.
But then `A = (df/dx)(x0)` (the derivative of f evaluated at the point `x0`).
`y0` is not actually an input of `g` here (it is just an intermediate result), so writing `g(y)` (or `g(y0)`) does not really make sense. It is actually `g(x0)`.
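
If what you are after is how `df/dx` changes with respect to `f`, one possible way is the chain rule: differentiate `A` with respect to `x0` and divide by `dy/dx`. Here is a minimal sketch of that idea. The value `0.5` and the variable names are my own choices for illustration, and it assumes `dy/dx` is nonzero at `x0`:

```python
import torch

# Evaluation point x0 (0.5 is an arbitrary illustrative value)
x0 = torch.tensor([0.5], requires_grad=True)

# y0 = f(x0) = x0 ** 2
y0 = x0 ** 2

# A = (df/dx)(x0) = 2 * x0; create_graph=True keeps A differentiable
A = torch.autograd.grad(y0, x0, torch.ones_like(y0), create_graph=True)[0]

# A's graph only leads back to x0, so the gradient w.r.t. y0 is None
B = torch.autograd.grad(A, y0, torch.ones_like(A),
                        allow_unused=True, retain_graph=True)[0]
print(B)

# dA/dx0 is well defined (second derivative of f)
d2y_dx2 = torch.autograd.grad(A, x0, torch.ones_like(A), create_graph=True)[0]

# Chain rule: dA/dy = (dA/dx) / (dy/dx), valid only where dy/dx != 0
dg_dy = d2y_dx2 / A
print(dg_dy)
```

For `f(x) = x^2` this gives `2 / (2 * x0) = 1 / x0`. It only makes sense locally, where `f` is invertible around `x0`.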