I've come across some issues with getting the gradient with respect to the input in PyTorch 0.4. In older versions I would do something like in this post, but in PyTorch 0.4, since Variable is deprecated, the following code does not work:
In [1]: import torch
In [2]: inp = torch.rand(3, requires_grad=True).to(0)
In [3]: l = torch.nn.Linear(3,4).to(0)
In [4]: o = l(inp)
In [5]: o.backward(torch.ones_like(o).to(0))
In [6]: list(l.parameters())[0].grad
Out[6]:
tensor([[ 0.0948, 0.7947, 0.0521],
[ 0.0948, 0.7947, 0.0521],
[ 0.0948, 0.7947, 0.0521],
[ 0.0948, 0.7947, 0.0521]], device='cuda:0')
In [7]: inp.grad
In [8]:
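Note that inp.grad prints nothing here because it is None. I also noticed that inp.is_leaf is False after the .to(0) call, in case that is related:

In [9]: inp.is_leaf
Out[9]: False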
However, the following code does work, which confuses me. Does anyone have an explanation for this behavior in PyTorch 0.4?
In [1]: import torch
In [2]: inp = torch.rand(3, requires_grad=True)
In [3]: l = torch.nn.Linear(3,4).to(0)
In [4]: o = l(inp.to(0))
In [5]: o.backward(torch.ones_like(o).to(0))
In [6]: inp.grad
Out[6]: tensor([-0.7010, -0.2453, -1.1455])
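In this second snippet, inp.is_leaf stays True, which I suspect is the relevant difference:

In [7]: inp.is_leaf
Out[7]: True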
Thanks for your prompt help @SimonW. Can you elaborate a bit on this? I don't quite understand what it means:
rand_result[leaf, requires_grad=True] ==> inp[intermediate result from rand_result.to()] ==> ...
I guess what you mean is that in the first example, torch.rand() returns a leaf tensor, and after .to(), inp is an intermediate (non-leaf) tensor, so autograd does not populate its .grad. That seems similar to Variable(Tensor, requires_grad=True).cuda() in older versions.
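To check my understanding, here is a minimal sketch of two workarounds that seem to work for me (the device index 0 is just my setup, and the printed gradient values are placeholders since the tensors are random):

import torch

# Option 1: create the tensor directly on the GPU using the factory kwargs,
# so inp itself is the leaf and its .grad gets populated by backward().
inp = torch.rand(3, device='cuda:0', requires_grad=True)
l = torch.nn.Linear(3, 4).to(0)
o = l(inp)
o.backward(torch.ones_like(o))
print(inp.is_leaf, inp.grad)  # True tensor([...], device='cuda:0')

# Option 2: keep the .to(0) call, but explicitly ask the intermediate
# (non-leaf) tensor to retain its gradient.
inp = torch.rand(3, requires_grad=True).to(0)
inp.retain_grad()
o = l(inp)
o.backward(torch.ones_like(o))
print(inp.is_leaf, inp.grad)  # False, but .grad is populated anyway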