I've come across some issues with getting the gradient with respect to the input in PyTorch 0.4. In older versions I would do something like in this post, but in PyTorch 0.4, since Variable is deprecated, the following code does not work:
In [1]: import torch
In [2]: inp = torch.rand(3, requires_grad=True).to(0)
In [3]: l = torch.nn.Linear(3,4).to(0)
In [4]: o = l(inp)
In [5]: o.backward(torch.ones_like(o).to(0))
In [6]: list(l.parameters())[0].grad
Out[6]:
tensor([[ 0.0948, 0.7947, 0.0521],
[ 0.0948, 0.7947, 0.0521],
[ 0.0948, 0.7947, 0.0521],
[ 0.0948, 0.7947, 0.0521]], device='cuda:0')
In [7]: inp.grad
In [8]:
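Note that inp.grad prints nothing here because it is None. I also noticed that inp.is_leaf is False after the .to(0) call, in case that is related:

In [9]: inp.is_leaf
Out[9]: False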
However, the following code does work, which confuses me. Does anyone have an explanation for this behavior in PyTorch 0.4?
In [1]: import torch
In [2]: inp = torch.rand(3, requires_grad=True)
In [3]: l = torch.nn.Linear(3,4).to(0)
In [4]: o = l(inp.to(0))
In [5]: o.backward(torch.ones_like(o).to(0))
In [6]: inp.grad
Out[6]: tensor([-0.7010, -0.2453, -1.1455])
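In this second snippet, inp.is_leaf stays True, which I suspect is the relevant difference:

In [7]: inp.is_leaf
Out[7]: True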
Thanks for your prompt help @SimonW. Can you elaborate a bit on this? I don't quite understand what it means:
rand_result[leaf, requires_grad=True] ==> inp[intermediate result from rand_result.to()] ==> ...
I guess what you mean is that in the first example, torch.rand() returns a leaf tensor, and after .to(), inp is an intermediate (non-leaf) tensor, so autograd does not populate its .grad. That seems similar to Variable(Tensor, requires_grad=True).cuda() in older versions.
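To check my understanding, here is a minimal sketch of two workarounds that seem to work for me (the device index 0 is just my setup, and the printed gradient values are placeholders since the tensors are random):

import torch

# Option 1: create the tensor directly on the GPU using the factory kwargs,
# so inp itself is the leaf and its .grad gets populated by backward().
inp = torch.rand(3, device='cuda:0', requires_grad=True)
l = torch.nn.Linear(3, 4).to(0)
o = l(inp)
o.backward(torch.ones_like(o))
print(inp.is_leaf, inp.grad)  # True tensor([...], device='cuda:0')

# Option 2: keep the .to(0) call, but explicitly ask the intermediate
# (non-leaf) tensor to retain its gradient.
inp = torch.rand(3, requires_grad=True).to(0)
inp.retain_grad()
o = l(inp)
o.backward(torch.ones_like(o))
print(inp.is_leaf, inp.grad)  # False, but .grad is populated anyway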