Can't get gradient from cloned variables

Yiu_Lau · February 7, 2018, 1:39am

import torch
from torch.autograd import Variable
q = Variable(torch.randn(2),requires_grad=True)
q_prime = q.clone()
x = torch.dot(q,q)
y = torch.dot(q_prime,q_prime)
x.backward()
y.backward()
print(q.grad)
print(q_prime.grad)

Output:

Variable containing:
6.2226
0.3911
[torch.FloatTensor of size 2]
None

Why can’t I get the gradient from the cloned variable?

SimonW · February 7, 2018, 3:07am

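Because clone is itself an operation in the computation graph, q_prime is an intermediate (non-leaf) node rather than a leaf. By default, intermediate nodes do not retain their gradients, so q_prime.grad stays None. In your case the gradient of y flows back through the clone and is accumulated into q.grad, together with the gradient of x. If you want q_prime to retain its gradient, call q_prime.retain_grad() before calling backward().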
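
As a rough sketch of that suggestion, here is the original snippet with retain_grad() added. It assumes PyTorch 0.4 or later, where plain tensors carry requires_grad and the Variable wrapper is no longer needed. The fixed values [1.0, 2.0] are an illustrative substitute for torch.randn(2), chosen so the gradients can be checked by hand.

```python
import torch

# Assumption: fixed values instead of torch.randn(2), so the result is easy to verify.
q = torch.tensor([1.0, 2.0], requires_grad=True)

q_prime = q.clone()      # non-leaf node: it is the output of an operation on q
q_prime.retain_grad()    # ask autograd to keep the gradient on this intermediate node

x = torch.dot(q, q)
y = torch.dot(q_prime, q_prime)
x.backward()
y.backward()

print(q_prime.is_leaf)   # False, which is why .grad is not kept by default
print(q_prime.grad)      # dy/dq_prime = 2 * q_prime
print(q.grad)            # 2 * q from x, plus 2 * q from y passed back through clone
```

With these inputs, q_prime.grad should come out as [2, 4] and q.grad as [4, 8], which matches the original observation that the gradients of both x and y end up accumulated in q.grad.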

Jian_Hui · March 28, 2018, 6:59am

Your answer is really helpful!
Thanks!