# How do I calculate the gradients of a non-leaf variable w.r.t. a loss function?

**Abhai_Kollara**(Abhai Kollara) #1

If my understanding is correct, calling the `.backward()` of a `Variable` only generates the gradients of the leaf nodes. Is there any way to calculate the gradients w.r.t. an arbitrary `Variable`, sort of like `torch.gradients(loss, variables)`?
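For what it's worth, autograd does provide a functional interface along these lines: `torch.autograd.grad(outputs, inputs)` returns the gradients w.r.t. arbitrary tensors, leaf or not, without populating `.grad`. A minimal sketch:

```python
import torch

x = torch.randn(3, requires_grad=True)
y = x + 1                  # non-leaf tensor
loss = (y * y).sum()       # d(loss)/dy = 2 * y
# Gradients w.r.t. the non-leaf tensor y, returned directly as a tuple
(gy,) = torch.autograd.grad(loss, y)
```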

**tom**(Thomas V) #2

Hello @Abhai_Kollara

Update:

The proper solution is to use `.retain_grad()`:

```
v = torch.autograd.Variable(torch.randn(3), requires_grad=True)
v2 = v + 1           # non-leaf
v2.retain_grad()     # ask autograd to keep v2's gradient after backward
v2.sum().backward()
v2.grad              # now populated
```

Apparently, this is common enough.

This is what I had posted before I knew better:

how about using hooks, e.g.

```
v = torch.autograd.Variable(torch.randn(3), requires_grad=True)

def require_nonleaf_grad(v):
    def hook(g):
        v.grad_nonleaf = g
    v.register_hook(hook)

v2 = v + 1
require_nonleaf_grad(v2)
v2.sum().backward()
v2.grad_nonleaf
```

I don’t recommend calling it `.grad`, to avoid colliding with PyTorch internals.

Best regards

Thomas


**amirid** #3

Hello @tom

I want to get the gradients of the discriminator network w.r.t. the fake variable generated by the generator net. I tried both of your solutions: with the first I get `None` for the gradient value, and with the second I get the following error:

`AttributeError: 'Tensor' object has no attribute 'grad_nonleaf'`

Do you have any idea of what’s wrong?

**tom**(Thomas V) #4

Most likely, your generator output doesn’t require gradients; perhaps you put the generator in eval mode or set `requires_grad=False` on its parameters.

Best regards

Thomas
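For reference, here is the `.retain_grad()` route from post #2 applied to a GAN-style setup; the tiny `gen`/`disc` modules are hypothetical stand-ins, not the networks from the question:

```python
import torch

# Hypothetical tiny stand-ins for the generator and discriminator
gen = torch.nn.Linear(4, 8)
disc = torch.nn.Linear(8, 1)

z = torch.randn(2, 4)
fake = gen(z)        # non-leaf: output of the generator
fake.retain_grad()   # keep the gradient w.r.t. the fake batch
disc(fake).sum().backward()
fake.grad            # gradients of the discriminator loss w.r.t. fake
```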

**JoeHEZHAO**(Joe) #5

Dear Thomas

Thanks for your post. I am also running into the ‘**optimize non-leaf variable**’ issue, and I would be grateful for feedback on my approach.

What I want to do is use a **learnable** linear combination of basis filters as the convolution weights and optimize **only the linear coefficients**, as below:

```
class net(torch.nn.Module):
    def __init__(self):
        filter_basis = [f_height, f_width, num_filter]
        self.coeff = [num_filter, c_in, c_out]  # to be optimized
        self.conv2d = torch.nn.Conv2d(c_in, c_out, stride=1, padding=1)
        self.conv2d.weight = torch.matmul(filter_basis, self.coeff)  # linear combination of pre-fixed filters as weights

    def forward(self, data):
        return self.conv2d(data)

    def train(self):
        loss = ...
        optim(self.coeff, loss)
```

How should I achieve my goal?
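One way to sketch this, under assumed shapes (the `BasisConv2d` class and all dimensions here are hypothetical, not from the thread): recombine the basis inside `forward` via `torch.nn.functional.conv2d`, so the weights are rebuilt from `coeff` on every step and `coeff` stays a leaf parameter. Assigning a precomputed tensor to `conv2d.weight`, as in the snippet above, would freeze the combination instead.

```python
import torch
import torch.nn.functional as F

class BasisConv2d(torch.nn.Module):
    # Hypothetical module; shapes follow the sketch in the question
    def __init__(self, c_in, c_out, num_filter, f_size=3):
        super().__init__()
        # Fixed basis filters [num_filter, f_size, f_size]: a buffer, not trained
        self.register_buffer("filter_basis", torch.randn(num_filter, f_size, f_size))
        # Linear coefficients [c_out, c_in, num_filter]: the only leaf parameter
        self.coeff = torch.nn.Parameter(torch.randn(c_out, c_in, num_filter))

    def forward(self, x):
        # Recombine basis and coefficients on every forward pass so autograd
        # tracks the path from the loss back to self.coeff
        weight = torch.einsum("oin,nhw->oihw", self.coeff, self.filter_basis)
        return F.conv2d(x, weight, stride=1, padding=1)

net = BasisConv2d(c_in=3, c_out=8, num_filter=4)
opt = torch.optim.SGD([net.coeff], lr=0.1)  # optimize only the coefficients
out = net(torch.randn(2, 3, 16, 16))
out.sum().backward()
opt.step()
```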