When is .grad None?

Here’s another scenario where a variable that requires a gradient still has no gradient even after backward():

import torch
from torch.autograd import Variable

x = Variable(torch.ones(2, 2), requires_grad=True)  # leaf variable
y = x + 2                                            # non-leaf, computed from x
y.retain_grad()                                      # ask autograd to keep y.grad

out1 = x.mean()
out1.backward()                    # out1 depends only on x, not on y
print(y.grad, y.requires_grad)

out2 = y.mean()
out2.backward()                    # out2 depends on y, which depends on x
print(x.grad, x.requires_grad)

The outputs are:

None True
tensor([[0.5000, 0.5000],
        [0.5000, 0.5000]]) True

The reason is simple: in the computational graph, x -> y but not y -> x. out1 = x.mean() never uses y, so backpropagating from out1 never reaches y and y.grad stays None, even though y.requires_grad is True and retain_grad() was called. out2 = y.mean(), on the other hand, flows through y back to x, so x.grad is populated. (It shows 0.5000 rather than 0.2500 because .grad accumulates across backward() calls: 0.25 from out1.backward() plus 0.25 from out2.backward().)
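
As a quick cross-check (a minimal sketch, assuming PyTorch >= 0.4, where tensors take requires_grad directly), torch.autograd.grad makes the same asymmetry explicit: asking for the gradient of out1 with respect to y only succeeds with allow_unused=True and returns None, while the gradient of out2 with respect to x is well defined.

import torch

x = torch.ones(2, 2, requires_grad=True)
y = x + 2

out1 = x.mean()   # does not use y
out2 = y.mean()   # uses y, which uses x

# out1 never touches y, so there is no gradient to return;
# without allow_unused=True this call raises a RuntimeError.
print(torch.autograd.grad(out1, y, allow_unused=True))   # (None,)

# out2 -> y -> x, so d(out2)/dx exists: 0.25 everywhere.
print(torch.autograd.grad(out2, x))

Unlike backward(), torch.autograd.grad returns the gradients instead of accumulating them into .grad, which makes it convenient for checks like this.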