I have a situation that looks something like this:
z = f(x_1, ..., x_n, y)
w = g(z)
I can’t detach y because I need it for
In TensorFlow, I would create the variables under different scopes and then do something like
How would I handle this in torch?
What do you mean by this?
The argument that backward takes is the gradient with respect to the tensor you call backward on, so it must have the same shape as that tensor.
What are you trying to do here?
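To make the point about the backward argument concrete, here is a minimal sketch. The tensor values and the names x, z, and v are made up for illustration; they are not from the original post.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
z = x ** 2  # non-scalar output, so backward() needs an explicit gradient

# v plays the role of "the gradient w.r.t. z": it has the same shape as z,
# and autograd propagates it back through the graph (a vector-Jacobian product).
v = torch.tensor([1.0, 0.1, 0.01])
z.backward(v)

# For this elementwise function, x.grad equals v * 2 * x.
x_grad = x.grad.clone()
```

Passing a tensor of ones as v would give the plain sum of the per-element gradients, which is what calling backward() on z.sum() would produce.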
I’m a little new to PyTorch so I realize I am not quite using these methods correctly. I want y.grad to be the gradient dw/dy and the .grad property for all other variables, x_i, to be dz/dx_i. Does that make sense?
Since y and the x_i are used to compute both w and z, you will need to do two backward passes: one to get the gradients of w and one to get the gradients of z. You can either use the autograd.grad function to state explicitly which gradients you want, or simply save the .grad field of the variables you care about and ignore the other ones.
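A rough sketch of the two-pass approach with torch.autograd.grad might look like the following. The concrete f and g here (a product and a square) and the two-variable setup are stand-ins chosen for illustration, since the original functions were not given.

```python
import torch

# Hypothetical leaf tensors; the real x_i and y come from your model.
x1 = torch.tensor(2.0, requires_grad=True)
x2 = torch.tensor(3.0, requires_grad=True)
y = torch.tensor(4.0, requires_grad=True)

z = x1 * x2 * y  # stand-in for z = f(x_1, ..., x_n, y)
w = z ** 2       # stand-in for w = g(z)

# Pass 1: dz/dx_i. retain_graph=True keeps the graph of z alive,
# because the second pass has to traverse it again on the way from w to y.
dz_dx1, dz_dx2 = torch.autograd.grad(z, [x1, x2], retain_graph=True)

# Pass 2: dw/dy only.
(dw_dy,) = torch.autograd.grad(w, [y])

# Optionally store the results in .grad so an optimizer can pick them up.
x1.grad, x2.grad, y.grad = dz_dx1, dz_dx2, dw_dy
```

Unlike backward(), autograd.grad returns the requested gradients without accumulating into .grad, so each variable ends up with exactly the gradient you asked for.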
Thanks for the response. At this point, it might be clarifying to point you to a thread that I started after this one that has a few more details on my problem: Higher-order gradients w.r.t. different functions.
I am currently doing two backward passes, but in principle it seems like I should be able to do just one.