Attribute .grad returns None for a tensor

Hi, I am interested in gradients for some parameters after applying a loss function. Hence, I do as follows:

import torch
import torch.nn as nn

x = torch.autograd.Variable(torch.Tensor(nna.scalar_dot), requires_grad=True)
w = torch.autograd.Variable(torch.Tensor(nna.W_out), requires_grad=True)
out = torch.matmul(x, w)

loss_fn = nn.BCEWithLogitsLoss()
loss = loss_fn(out, y_tensor)
loss.backward(retain_graph=True)

out.grad --> returns None

However, if I detach the tensor out and construct it as torch.autograd.Variable(torch.Tensor(torch.matmul(x, w)), requires_grad=True), I do get a gradient for it, but x.grad then becomes None (which is understandable, since we "unpin" x from the graph).

Could someone explain to me why out.grad is None and how I can get it? Thanks in advance!

Use out.retain_grad().

PyTorch by default only saves the gradients for the initial variables x and w (the “leaf” variables) that have requires_grad=True set – not for intermediate outputs like out.
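For illustration, here is a minimal sketch of the leaf / non-leaf distinction, with made-up shapes since nna.scalar_dot and nna.W_out aren't shown:

import torch

x = torch.randn(3, 4, requires_grad=True)   # leaf tensor
w = torch.randn(4, 2, requires_grad=True)   # leaf tensor
out = torch.matmul(x, w)                    # non-leaf: produced by an operation

print(x.is_leaf, w.is_leaf, out.is_leaf)    # True True False
print(out.requires_grad)                    # True, but out.grad stays None after backward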

To save the gradient for out, use the retain_grad method:

out = torch.matmul(x, w)
out.retain_grad()
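
For completeness, a self-contained sketch of the full flow, again with made-up shapes and dummy 0/1 targets in place of y_tensor:

import torch
import torch.nn as nn

x = torch.randn(3, 4, requires_grad=True)
w = torch.randn(4, 2, requires_grad=True)
y = torch.empty(3, 2).random_(2)        # dummy float targets of 0s and 1s

out = torch.matmul(x, w)
out.retain_grad()                       # ask autograd to keep .grad for this non-leaf tensor

loss = nn.BCEWithLogitsLoss()(out, y)
loss.backward()

print(out.grad.shape)                   # torch.Size([3, 2]) instead of None
print(x.grad.shape)                     # leaf gradients are still populated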

Sam, thanks a lot! Works perfectly.