A weird question about the `+=` operation when changing the model input

Hi~
I came across a weird issue these days, and the minimal code to reproduce it is below:

import torch
import torch.nn as nn
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 3, kernel_size=3, stride=1, padding=1) 

    def forward(self, x):
        return self.conv1(x)

model = Net()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
model.train()

tot = 0
input = torch.rand(4, 3, 32, 32)
for _ in range(2):
    out = model(input)
    tot += out

    delta = torch.rand(out.shape)

    # >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
    input += delta            # ERROR
    # input = input + delta   # CORRECT
    # <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

with torch.autograd.set_detect_anomaly(True):
    loss = tot.mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

When I change the input of the model with the `+=` operation, I get a runtime error at the `loss.backward()` line when running with `torch.autograd.set_detect_anomaly(True)`:

“RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [4, 3, 32, 32]] is at version 1; expected version 0 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!”

This was confusing until I separated `+=` into `+` and `=`. I'm still wondering: is there any difference between these two methods in this situation? After all, it's just the model input, which doesn't require grad. (I'm using version 1.5.1.)

Thanks in advance!

Whether the input requires grad makes no difference: the input is used by the first layer in an operation that works like `layer1(input, weights1)`. That operation is normally multiplication-based, so computing the gradient of `weights1` depends on the saved input, and the in-place `+=` op overwrites it.
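To illustrate, here is a minimal sketch of the same failure without the module wrapper. It peeks at the internal `_version` counter that autograd's saved-tensor check relies on; the tensor names are made up for this example:

import torch
import torch.nn.functional as F

x = torch.rand(4, 3, 32, 32)                    # "input": does not require grad
w = torch.rand(3, 3, 3, 3, requires_grad=True)  # conv weight

# conv2d saves its input so it can compute the weight gradient in backward
out = F.conv2d(x, w, padding=1)
print(x._version)   # 0 -- the version autograd recorded when saving x

x += 1              # in-place: mutates the very tensor that was saved
print(x._version)   # 1 -- the version check will now fail

try:
    out.mean().backward()
except RuntimeError as e:
    print(e)        # "... has been modified by an inplace operation ..."

# x = x + 1 instead creates a new tensor and leaves the saved one untouched,
# which is why the out-of-place form works.

So the out-of-place `input = input + delta` is the right fix here, since it rebinds the name to a new tensor instead of mutating the one that was saved for backward.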


Thanks! That’s clear.