Variable grad is always None when extending autograd

Yeah, it looks like what’s happening is that x = Variable(torch.zeros(...), requires_grad=True).cuda() is really two steps: it first creates a hidden intermediate Variable y = Variable(torch.zeros(...), requires_grad=True) on the CPU and then assigns x = y.cuda().

Since y is the leaf node, the gradients only accumulate in y and not in x, so x.grad stays None.
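
Here’s a minimal sketch of the difference, assuming a CUDA device is available (it uses the old Variable API from this thread; the fix is to move the tensor to the GPU *before* wrapping it, so the Variable itself is the leaf):

```python
import torch
from torch.autograd import Variable

# Problematic pattern: .cuda() returns a new, non-leaf Variable.
x = Variable(torch.zeros(3), requires_grad=True).cuda()
x.sum().backward()
print(x.grad)  # None -- the gradient accumulated on the hidden CPU leaf

# Workaround: create the tensor on the GPU first, then wrap it,
# so x itself is the leaf node that accumulates gradients.
x = Variable(torch.zeros(3).cuda(), requires_grad=True)
x.sum().backward()
print(x.grad)  # tensor of ones on the GPU -- x is the leaf now
```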
