Demo code:

```
import torch
from torch.autograd import Variable
a1 = Variable(torch.rand(4,5))
lin = torch.nn.Linear(5, 6)
a2 = lin(a1)
l1 = a2.sum()
print(a2.grad)
l1.backward()
print(a2.grad)
```

Why does a2.grad remain None in both cases?

Because `a2` is the result of a computation, it is a non-leaf tensor. By default, PyTorch only populates `.grad` for leaf tensors (those created directly by the user with `requires_grad=True`); gradients of intermediate results are freed during the backward pass to save memory.
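A minimal runnable sketch of the leaf/non-leaf distinction (using plain tensors, since `Variable` has been merged into `Tensor` in modern PyTorch):

```python
import torch

# Leaf tensor: created directly by the user, with requires_grad=True.
a1 = torch.rand(4, 5, requires_grad=True)
lin = torch.nn.Linear(5, 6)
a2 = lin(a1)   # non-leaf: produced by a computation
l1 = a2.sum()
l1.backward()

print(a1.is_leaf, a1.grad is not None)  # True True  -> leaf grads are kept
print(a2.is_leaf, a2.grad is None)      # False True -> non-leaf grads are freed
```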

So what should I use to expose the gradients of intermediate results?

Either of these works:

```
# torch.autograd.grad returns a tuple with one entry per input tensor
a2_grad, = torch.autograd.grad(l1, a2)
```

or

```
a2.retain_grad()
...
l1.backward()
print(a2.grad)
```
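Putting the `retain_grad()` variant together as a complete runnable sketch (again with plain tensors rather than `Variable`):

```python
import torch

a1 = torch.rand(4, 5)
lin = torch.nn.Linear(5, 6)
a2 = lin(a1)
a2.retain_grad()  # ask autograd to keep the grad of this non-leaf tensor;
                  # must be called before backward()
l1 = a2.sum()
l1.backward()

print(a2.grad.shape)  # torch.Size([4, 6])
# Since l1 = a2.sum(), d(l1)/d(a2) is a tensor of ones.
```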
