Demo code:

```
import torch
from torch.autograd import Variable
a1 = Variable(torch.rand(4,5))
lin = torch.nn.Linear(5, 6)
a2 = lin(a1)
l1 = a2.sum()
print(a2.grad)
l1.backward()
print(a2.grad)
```

Why does a2.grad remain None in both cases?

Because `a2` is the result of a computation, it is a non-leaf tensor. By default, PyTorch only populates `.grad` for leaf tensors (those created directly by the user with `requires_grad=True`); gradients of intermediate results are freed during the backward pass to save memory.
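A minimal runnable sketch of the leaf/non-leaf distinction (using plain tensors, since `Variable` has been merged into `Tensor` in modern PyTorch):

```python
import torch

# Leaf tensor: created directly by the user, with requires_grad=True.
a1 = torch.rand(4, 5, requires_grad=True)
lin = torch.nn.Linear(5, 6)
a2 = lin(a1)   # non-leaf: produced by a computation
l1 = a2.sum()
l1.backward()

print(a1.is_leaf, a1.grad is not None)  # True True  -> leaf grads are kept
print(a2.is_leaf, a2.grad is None)      # False True -> non-leaf grads are freed
```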

So what should I use to expose the gradients of intermediate results?

Either of these works:

```
# torch.autograd.grad returns a tuple with one entry per input tensor
a2_grad, = torch.autograd.grad(l1, a2)
```

or

```
a2.retain_grad()
...
l1.backward()
print(a2.grad)
```
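Putting the `retain_grad()` variant together as a complete runnable sketch (again with plain tensors rather than `Variable`):

```python
import torch

a1 = torch.rand(4, 5)
lin = torch.nn.Linear(5, 6)
a2 = lin(a1)
a2.retain_grad()  # ask autograd to keep the grad of this non-leaf tensor;
                  # must be called before backward()
l1 = a2.sum()
l1.backward()

print(a2.grad.shape)  # torch.Size([4, 6])
# Since l1 = a2.sum(), d(l1)/d(a2) is a tensor of ones.
```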
