After 1.4 -> 1.5, how to call backward() twice?

Hello, I found that the 1.5 upgrade changed autograd behavior, and there is a note in the release notes (https://github.com/pytorch/pytorch/releases) for someone like me who doesn't know what's wrong: "torch.optim optimizers changed to fix in-place checks for the changes made by the optimizer".

import torch
from torch import optim

def model(input, target, param):
    return (input * param ** 2 - target).norm()

param = torch.randn(2, requires_grad=True)
input = torch.randn(2)
target = torch.randn(2)
sgd = optim.SGD([param], lr=0.001)
loss = model(input, target, param)
loss.backward(retain_graph=True)  # keep the graph for a second backward
sgd.step()                        # updates param in place
loss.backward()                   # second backward through the same graph
param.grad

Before 1.5, the code above worked. After 1.5 the second backward() raises a RuntimeError (a variable needed for gradient computation has been modified by an in-place operation), and I have to write it like this:

def model(input, target, param):
    return (input * param ** 2 - target).norm()

param = torch.randn(2, requires_grad=True)
input = torch.randn(2)
target = torch.randn(2)
sgd = optim.SGD([param], lr=0.001)
loss = model(input, target, param.clone())  # clone so the graph saves a copy, not param itself
loss.backward(retain_graph=True)
sgd.step()
loss.backward()
param.grad

That is, I have to pass param.clone() into the model.
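From the release note, my understanding of why the clone helps: since 1.5 the optimizer's in-place update bumps the parameter's version counter, so a graph that saved the parameter refuses to run backward again. A minimal sketch of that check (using the internal _version attribute purely for illustration):

import torch

p = torch.randn(2, requires_grad=True)
loss = (p ** 2).sum()   # the backward of pow needs the saved value of p

print(p._version)       # 0
with torch.no_grad():
    p.add_(1.0)         # in-place update, like optimizer.step() does
print(p._version)       # 1 -- the value saved in the graph is now stale

loss.backward()         # RuntimeError: ... modified by an inplace operation

With param.clone(), the graph saves the clone instead, and step() only touches param, so the check passes.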

But in reality I don't use a model like the one above; I only pass inputs into my model.

import torch
import torch.nn as nn

class test1(nn.Module):
    def __init__(self):
        super(test1, self).__init__()
        self.layer1 = nn.Linear(10, 1)

    def forward(self, x):
        x = self.layer1(x)
        return x

t = test1().to(device)  # device, criterion, batch_size and data_loader are defined elsewhere

optimizer1 = torch.optim.Adam(t.parameters(), lr=0.001)

for i, (images, labels) in enumerate(data_loader):
    images = images.view(batch_size, -1)[:, :10].to(device)
    labels = labels.float().to(device)

    a = t(images)
    loss = criterion(a, labels)

    optimizer1.zero_grad()
    loss.backward(retain_graph=True)
    optimizer1.step()
    loss.backward()  # fails on 1.5: step() updated layer1's parameters in place

↑ This is my test code.

How should I change it?

Hi,

The problem is that the original code here was computing wrong gradients: the second backward() ran through a graph whose saved parameter values had been modified in place by step(), so before 1.5 it silently mixed old and new values. 1.5 raises an error instead.

You can fix this quite easily by overriding the Linear forward function for this case:

import torch.nn as nn
import torch.nn.functional as F

class MyLinear(nn.Linear):
    def forward(self, input):
        # clone the parameters so the graph saves copies that step() won't touch
        return F.linear(input, self.weight.clone(), self.bias.clone())

# And use this one later:
self.layer1 = MyLinear(10, 1)
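For reference, a quick self-contained check that the double backward then goes through (a sketch, assuming the default bias=True of nn.Linear; any optimizer behaves the same way here):

import torch
import torch.nn as nn
import torch.nn.functional as F

class MyLinear(nn.Linear):
    def forward(self, input):
        return F.linear(input, self.weight.clone(), self.bias.clone())

layer = MyLinear(10, 1)
opt = torch.optim.Adam(layer.parameters(), lr=0.001)

x = torch.randn(4, 10)
loss = layer(x).sum()

opt.zero_grad()
loss.backward(retain_graph=True)
opt.step()
loss.backward()  # OK: the graph saved the clones, not the parameters step() modified

Note that the second backward still uses the values saved at forward time (the clones), which is exactly what retain_graph=True implies.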