Why doesn't the backward() method work with the .sum() method?

Demo code:

import torch
from torch.autograd import Variable
a1 = Variable(torch.rand(4, 5))   # leaf Variable, created directly by the user
lin = torch.nn.Linear(5, 6)
a2 = lin(a1)                      # non-leaf: the result of a computation
l1 = a2.sum()
print(a2.grad)                    # None
l1.backward()
print(a2.grad)                    # still None

Why does a2.grad remain None both before and after the backward() call?

Because a2 is the result of a computation, not a leaf Variable. By default, PyTorch only populates .grad for leaf Variables (those created directly by the user, plus module parameters); gradients for intermediate results are freed during the backward pass to save memory.
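You can see the distinction via grad_fn, which is None for leaves (a quick check, assuming the demo above has already been run):

print(a1.grad_fn)  # None: a1 is a leaf, created directly by the user
print(a2.grad_fn)  # a backward-function object: a2 was produced by lin(a1)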

So what should I use to expose the gradient of a Variable created by a calculation?

Either of these works:

a2_grad, = torch.autograd.grad(l1, a2)  # returns a tuple, one gradient per input

or

a2.retain_grad()   # ask autograd to keep this non-leaf's gradient
...
l1.backward()
print(a2.grad)     # now populated
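For reference, a minimal runnable sketch of the retain_grad() fix, reusing the names from the demo above:

import torch
from torch.autograd import Variable
a1 = Variable(torch.rand(4, 5))
lin = torch.nn.Linear(5, 6)
a2 = lin(a1)
a2.retain_grad()   # keep the gradient of this intermediate result
l1 = a2.sum()
l1.backward()
print(a2.grad)     # a (4, 6) Variable of ones, since d(l1)/d(a2) = 1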