How do I calculate the gradients of a non-leaf variable w.r.t. a loss function?

If my understanding is correct, calling .backward() on a Variable only populates the gradients of the leaf nodes. Is there any way to calculate the gradients w.r.t. an arbitrary Variable, sort of like torch.gradients(loss, variables)?
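For reference, PyTorch does ship essentially this function: torch.autograd.grad(outputs, inputs) computes gradients of an output w.r.t. arbitrary tensors in the graph, non-leaves included, without touching any .grad attributes. A minimal sketch:

```python
import torch

x = torch.randn(3, requires_grad=True)
y = x + 1                     # non-leaf intermediate
loss = (y ** 2).sum()

# Gradient of loss w.r.t. the non-leaf y
(grad_y,) = torch.autograd.grad(loss, y)
print(grad_y)                 # equals 2 * y
```

Note that torch.autograd.grad returns a tuple, one gradient per input, which is why the result is unpacked.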


Hello @Abhai_Kollara

The proper solution is to use .retain_grad():

v = torch.autograd.Variable(torch.randn(3), requires_grad=True)
v2 = v + 1
v2.retain_grad()       # ask autograd to keep the gradient of this non-leaf
v2.sum().backward()
print(v2.grad)         # now populated instead of None

Apparently, this is common enough.

This is what I had posted before I knew better:

how about using hooks, e.g.

v = torch.autograd.Variable(torch.randn(3), requires_grad=True)

def require_nonleaf_grad(v):
    def hook(g):
        v.grad_nonleaf = g
    v.register_hook(hook)   # the hook must actually be registered

v2 = v + 1
require_nonleaf_grad(v2)
v2.sum().backward()
print(v2.grad_nonleaf)

I don’t recommend calling the attribute .grad, to avoid colliding with PyTorch internals.

Best regards



Hello @tom

I want to get the gradients of the discriminator network w.r.t. the fake variable generated by the generator net. I tried both of your solutions: with the first I get None for the gradient value, and with the second I get the following error:
AttributeError: 'Tensor' object has no attribute 'grad_nonleaf'

Do you have any idea of what’s wrong?

Most likely, your generator output doesn’t require gradients. Perhaps you put the generator in eval mode or set requires_grad=False on its parameters?
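A quick way to check that hypothesis, using a stand-in linear layer in the role of the generator (netG and noise are made-up names for this sketch):

```python
import torch

netG = torch.nn.Linear(4, 4)      # stand-in for the generator network
noise = torch.randn(1, 4)

fake = netG(noise)
print(fake.requires_grad)         # True: output is connected to trainable params

# If the generator's parameters are frozen (and the noise does not require
# grad either), the output no longer requires gradients:
for p in netG.parameters():
    p.requires_grad_(False)
fake2 = netG(noise)
print(fake2.requires_grad)        # False: hooks / retain_grad will see nothing
```

If the printed value is False for your fake tensor, neither .retain_grad() nor a hook can give you a gradient for it.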

Best regards


Dear Thomas

Thanks for your post. I am also encountering the ‘optimize a non-leaf variable’ issue and I would be grateful if you could provide some feedback on my approach.

What I want to do is use a learnable linear combination of fixed basis filters as the convolution weights and optimize only the linear coefficients, as below:

class Net(torch.nn.Module):
    def __init__(self, c_in, c_out, num_filter, f_height, f_width):
        super().__init__()
        # Fixed basis filters (num_filter, f_height, f_width); not optimized
        self.register_buffer('filter_basis',
                             torch.randn(num_filter, f_height, f_width))
        # Linear coefficients (c_out, c_in, num_filter); to be optimized
        self.coeff = torch.nn.Parameter(torch.randn(c_out, c_in, num_filter))

    def forward(self, data):
        # Rebuild the weight from the basis on every forward pass so the
        # graph connects the loss to self.coeff, which is a leaf parameter
        weight = torch.einsum('oin,nhw->oihw', self.coeff, self.filter_basis)
        return torch.nn.functional.conv2d(data, weight, stride=1, padding=1)

    def train_step(self, data, optimizer):  # renamed: Module.train() already exists
        loss = ...
        optimizer.zero_grad()
        loss.backward()                     # gradients flow into self.coeff
        optimizer.step()

How should I achieve my goal?
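One way to avoid the non-leaf problem entirely is to never assign the combined weight to a module: keep the coefficient tensor as the leaf, rebuild the weight each step, and hand only the coefficients to the optimizer. Here is a self-contained sketch; all sizes and the loss are invented for illustration:

```python
import torch
import torch.nn.functional as F

# Hypothetical sizes, chosen only for the example
num_filter, f_h, f_w = 4, 3, 3
c_in, c_out = 2, 5

filter_basis = torch.randn(num_filter, f_h, f_w)                   # fixed, no grad
coeff = torch.randn(c_out, c_in, num_filter, requires_grad=True)   # leaf to optimize

opt = torch.optim.SGD([coeff], lr=0.1)     # optimizer sees only the coefficients

x = torch.randn(1, c_in, 8, 8)
weight = torch.einsum('oin,nhw->oihw', coeff, filter_basis)  # (c_out, c_in, f_h, f_w)
out = F.conv2d(x, weight, padding=1)
loss = out.pow(2).mean()                   # placeholder loss

opt.zero_grad()
loss.backward()                            # produces d loss / d coeff
opt.step()
```

Because the weight is recomputed inside the graph on every step, autograd backpropagates through the einsum into coeff; the basis filters receive no gradient and stay fixed.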
