I want to get the gradient of the output w.r.t. the model parameters. For example, y = xW, where x is a vector of size 1x5 and W, the model parameter, is a vector of size 5x1. I don't have a loss function; I just want the gradient of y w.r.t. W. Obviously, in this case, dy/dW should be x. However, the value I get from PyTorch's backward is wrong. What is the reason? Here is my code.
import torch
import numpy as np
import torch.nn as nn
import torch.nn.functional as torchF
from torch.autograd import Variable

class Net(nn.Module):
    def __init__(self, dim_action=1):  # assumes a [256,256,3] input
        super(Net, self).__init__()
        self.fc = nn.Linear(5, 1, bias=False)

    def forward(self, x):
        x = self.fc(x)
        return x

net = Net()
net.double()

x = Variable(torch.from_numpy(np.array([1.0, 2.0, 3.0, 4.0, 5.0]).reshape((1, 5))))
out = net(x)
outvalue = np.array(out.data)

net.zero_grad()
y = torch.randn(1, 1)
y.data = outvalue
y = y.double()
out.backward(y)

for f in net.parameters():
    print('data is')
    print(f.data)
    print('grad is')
    print(f.grad)
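For reference, this is the sanity check I expect to hold. It is only a minimal sketch on a bare weight tensor (it assumes a PyTorch version where requires_grad works directly on tensors, so no Variable wrapper), not my actual setup:

import torch

# Same computation, y = x @ W, but with W as a bare tensor that tracks gradients.
x = torch.tensor([[1.0, 2.0, 3.0, 4.0, 5.0]], dtype=torch.float64)   # 1x5
W = torch.randn(5, 1, dtype=torch.float64, requires_grad=True)       # 5x1

y = x.mm(W)                      # 1x1 output
y.backward(torch.ones_like(y))   # upstream gradient of 1, so W.grad should be dy/dW

print(W.grad.view(1, -1))        # I expect this to print [1., 2., 3., 4., 5.], i.e. x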
and the output is,
data is

-0.3839 -0.1894 -0.3376 -0.3934 -0.2313
[torch.DoubleTensor of size 1x5]

grad is
Variable containing:
 0.2647  0.5293  0.7940  1.0587  1.3233
[torch.DoubleTensor of size 1x5]
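To make the mismatch concrete: each printed grad entry is the corresponding entry of x scaled by roughly 0.2647 (0.5293 ≈ 2 × 0.2647, 0.7940 ≈ 3 × 0.2647, and so on), rather than x = [1, 2, 3, 4, 5] itself.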