requires_grad=True for two variables

Hi all,

I have a model that has multiple inputs, and I was wondering if it is possible to find the gradients of the output with respect to the inputs.

The layout is as follows:

output = f(inp1, inp2, …)

To train the model, we solve the MSE optimization problem: minimize loss = ||output - target||².

Then after the model is trained I am interested in finding the sensitivities:
d(output)/d(inp1), d(output)/d(inp2), …

I believe it is doable, but I am not sure how. Any help/hint is appreciated.


First, make sure your inputs require gradients:

  • If they don’t, just call requires_grad_() on them before giving them to your net
  • If they do and are leaves (inp1.is_leaf is True), then you’re good to go
  • If they do but are not leaves, you can do inp1.retain_grad() to make sure the .grad field will be populated properly.

Then if output is a scalar, you can simply .backward() on the output.
Otherwise, if you want the full Jacobian, you can use the torch.autograd.functional package (available since PyTorch 1.5).
If you want the sum of the gradients for each element in the output, you can do output.sum().backward().
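The steps above can be sketched as follows (the model f and the input sizes here are placeholders, not the original poster's actual code):

```python
import torch

# Hypothetical two-input model; any differentiable function works here.
def f(inp1, inp2):
    return (inp1 * 2 + inp2 ** 2).sum()  # scalar output

inp1 = torch.randn(5)
inp2 = torch.randn(5)

# Make sure the inputs require gradients before the forward pass.
inp1.requires_grad_()
inp2.requires_grad_()

output = f(inp1, inp2)
output.backward()  # output is a scalar, so no grad_output is needed

# Sensitivities d(output)/d(inp1) and d(output)/d(inp2)
print(inp1.grad)  # d/d(inp1) of sum(2*inp1 + inp2**2) = 2 everywhere
print(inp2.grad)  # equals 2 * inp2
```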


Hi albanD,

Many thanks for the response. Could you elaborate on what you mean by leaf? I will try to do it along the lines you suggested and come back to you. Thanks again!

A leaf Tensor is a Tensor that has no history (it was not produced by an operation recorded by autograd). You can check it with your_tensor.is_leaf (note that it is an attribute, not a method).
When calling .backward(), it will populate the .grad field of all the leaf Tensors that require gradients and were used in the computation of the output.
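A small illustration of the leaf/non-leaf distinction (toy tensors, not from the thread):

```python
import torch

a = torch.randn(3, requires_grad=True)  # created directly: a leaf
b = a * 2                               # result of an op: not a leaf

print(a.is_leaf)  # True
print(b.is_leaf)  # False

b.retain_grad()   # ask autograd to also keep b.grad
b.sum().backward()
print(a.grad)     # populated: d(sum(2*a))/da = 2 everywhere
print(b.grad)     # populated thanks to retain_grad(): all ones
```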

Thanks again for the explanation

I have the following after the output.backward() statement:

inpt2_grad = grad(output, inp2, torch.ones(inp2.size()[0], 1, device=dev), create_graph=True, retain_graph=True)

inpt2_grad = inpt2_grad.cpu().numpy()

I get the following error at the line where I have: inpt2_grad = grad(output, inp2, torch…
Mismatch in shape: grad_output[0] has a shape of torch.Size([15625, 1]) and output[0] has a shape of torch.Size([]).


If I do the following
inp2_grad = inp2.grad
inp2_grad = inp2_grad.detach().cpu().numpy()

I get None when I print inp2_grad
Also, I cannot convert it to a numpy array.


The issue here is that grad_output should have the same size as the output. In this case, the output is a scalar, so you can actually leave the grad_output argument empty; it defaults to a Tensor containing a single 1.

autograd.grad does not populate the .grad field of the Tensor; it returns the gradients directly, so you should read them the same way you do in your post above. Note that it returns a tuple though, so you might need to do inpt2_grad, = xxx.
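Putting those two points together, a sketch of what the corrected call could look like (the toy output below stands in for the original model):

```python
import torch
from torch.autograd import grad

inp2 = torch.randn(4, requires_grad=True)
output = (inp2 ** 2).sum()  # scalar, so grad_output can be omitted

# grad() returns a tuple with one entry per input Tensor passed in,
# hence the trailing comma to unpack the single result.
inpt2_grad, = grad(output, inp2, create_graph=True, retain_graph=True)

# detach() is needed before .numpy() because create_graph=True makes
# the returned gradient itself part of the autograd graph.
inpt2_grad_np = inpt2_grad.detach().cpu().numpy()
print(inpt2_grad_np.shape)  # (4,)
```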

Thanks, albanD.

I did the following:
inpt2_grad = grad(output, inpt2, create_graph=True, retain_graph=True)

But now I am encountering a new issue:
One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior.

Any insights on what allow_unused does? I tried to read the documentation, but I got lost.

Also, if I add the allow_unused=True argument, grad(output, …) returns None.

This error means that pytorch cannot find any link between output and inpt2, i.e. output was not computed in a differentiable way from inpt2.
So you want to double-check your code to make sure that output does depend on inpt2 (or, if it does not, you can remove this call and replace the result with a Tensor full of zeros).
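A small illustration of allow_unused (toy tensors; x2 intentionally does not feed into out):

```python
import torch
from torch.autograd import grad

x1 = torch.randn(3, requires_grad=True)
x2 = torch.randn(3, requires_grad=True)

out = (x1 ** 2).sum()  # x2 is never used in computing out

# Without allow_unused=True this call raises the
# "appears to not have been used in the graph" error.
g1, g2 = grad(out, (x1, x2), allow_unused=True)
print(g1)  # 2 * x1
print(g2)  # None: out does not depend on x2
```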

Thanks. I see.

So let me lay out the code, and you might be able to tell me if there is something wrong:

class Grad_finder:

    def fun1(x1, x2, x3):
        # we minimize the loss
        for i in range(epochs):
            L1 = f(x1, x3)  # f is a neural network model
            L2 = g(L1, x2)  # g is a defined function
            L3 = h(x1, x3)  # h is a defined function
            loss = L2 - L3
        # now f is trained
        L1 = f(x1, x3)
        L2 = g(L1, x2)
        return L2

    def fun2(x1, x2, x3):
        L2 = fun1(x1, x2, x3)
        # here I want to find dL2/dx2
        return dL2dx2

dL2dx2 = Grad_finder.fun2(x1, x2, x3)

Ok, so the only potentially differentiable link I can see between L2 and x2 is L2 = g(L1, x2).
So what is g here?


I think this is what I am missing: the relationship between L2 and x2. They are only implicitly related. I will have to revisit the problem and see how to proceed. Many thanks, albanD.