I am trying to do backpropagation with a different loss value. The computation graph stays the same; the only difference is the loss value I want to backpropagate. I created loss_1, which has the right computation graph but a different loss value. In this simple linear regression setting, y = ax + b, the gradient on a for model_1 is not what I want. What I want is 2 * (y_2 - y) * x, but model_1.linear1.weight.grad is different: it looks the same as 2 * (y_1 - y) * x. It seems that changing the loss value directly doesn't change the gradients at all, and the backward pass still uses the old loss. Is there any way to get around this?
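The two quantities named above can be written out with the chain rule. This is only a sketch for the single-sample case; the symbols \hat{y}_1, \hat{y}_2 (outputs of model_1 and model_2) and L_1 (the summed squared error of model_1) are shorthand introduced here, not names from the code.

```latex
% model_1: \hat{y}_1 = a x + b, with loss L_1 = (\hat{y}_1 - y)^2
\frac{\partial L_1}{\partial a}
  = \frac{\partial L_1}{\partial \hat{y}_1} \cdot \frac{\partial \hat{y}_1}{\partial a}
  = 2\,(\hat{y}_1 - y)\, x

% quantity asked for: same graph, but upstream factor taken from model_2's output
2\,(\hat{y}_2 - y)\, x
```

The numeric value of L_1 does not appear on the right-hand side of the first expression, only \hat{y}_1 does, which is consistent with the gradient staying the same after the loss value is overwritten.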
import torch
from torch.autograd import Variable
torch.manual_seed(10)
dtype = torch.FloatTensor
class TestNet(torch.nn.Module):
    def __init__(self, D_in, D_out):
        super(TestNet, self).__init__()
        self.linear1 = torch.nn.Linear(D_in, D_out)

    def forward(self, x):
        return self.linear1(x)
N, D_in, D_out = 1, 1, 1
x = Variable(torch.randn(N, D_in), requires_grad=True)
y = Variable(torch.randn(N, D_out), requires_grad=False)
model_1 = TestNet(D_in, D_out)
model_2 = TestNet(D_in, D_out)
criterion = torch.nn.MSELoss(reduction='sum')  # summed squared error, not averaged
optimizer_1 = torch.optim.SGD(model_1.parameters(), lr=1e-4)
optimizer_2 = torch.optim.SGD(model_2.parameters(), lr=1e-4)
if x.grad is not None:  # x.grad is None until the first backward pass
    x.grad.data.zero_()
y_2 = model_2(x)
loss_2 = criterion(y_2, y)
optimizer_2.zero_grad()
loss_2.backward()
x_grad_2 = x.grad.clone()
y_1 = model_1(x)
loss_1 = criterion(y_1, y)
loss_1.data.fill_(loss_2.item())  # overwrite the stored value of loss_1 with that of loss_2
optimizer_1.zero_grad()
x.grad.data.zero_()
loss_1.backward()
x_grad_1 = x.grad.clone()
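The behaviour described in the question can be reproduced in a few lines, together with one possible way to obtain 2 * (y_2 - y) * x. This is a minimal sketch, not a definitive fix: it assumes plain torch.nn.Linear layers as stand-ins for TestNet, writes the summed squared error by hand, and passes the upstream gradient through the `gradient` argument of `Tensor.backward`. The names `upstream`, `grad_after_fill`, and `grad_swapped` are made up for illustration.

```python
import torch

torch.manual_seed(10)

# Hypothetical stand-ins for model_1 and model_2: two independent 1-in, 1-out linear layers.
model_1 = torch.nn.Linear(1, 1)
model_2 = torch.nn.Linear(1, 1)
x = torch.randn(1, 1)
y = torch.randn(1, 1)

y_1 = model_1(x)
y_2 = model_2(x)

# Overwrite the stored value of loss_1 with model_2's loss, then backpropagate as in the post.
loss_1 = ((y_1 - y) ** 2).sum()
loss_1.data.fill_(((y_2 - y) ** 2).sum().item())
loss_1.backward(retain_graph=True)  # keep the graph for the second backward below
grad_after_fill = model_1.weight.grad.clone()

# The gradient still follows y_1, because backward uses d(loss)/d(y_1), not the loss value.
expected_plain = (2 * (y_1 - y) * x).detach()

# Sketch of an alternative: feed the upstream gradient 2 * (y_2 - y) into y_1's graph.
model_1.zero_grad()
upstream = (2 * (y_2 - y)).detach()
y_1.backward(gradient=upstream)
grad_swapped = model_1.weight.grad.clone()
expected_swapped = (2 * (y_2 - y) * x).detach()

print(torch.allclose(grad_after_fill, expected_plain))
print(torch.allclose(grad_swapped, expected_swapped))
```

The design choice here is to skip the loss node entirely on the second pass: since only the derivative of the loss with respect to y_1 enters the chain rule, supplying that factor directly is enough to change what reaches the weights.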