Inplace operation I cannot find

Hi,
I am implementing a network using ResNet with some fancy stuff on top of it, but there is an in-place operation on a variable somewhere in it and I can't put my finger on it. Any idea where it is?

import torch
import torch.nn as nn
from torchvision import models
from torchvision.models.resnet import conv3x3  # assuming conv3x3 is torchvision's resnet helper


class refinet(nn.Module):
    def __init__(self, num_classes):
        super(refinet, self).__init__()
        resnet = models.resnet18(pretrained=True)

        self.layer1 = nn.Sequential(
            resnet.conv1,
            resnet.bn1,
            resnet.relu,
            resnet.maxpool,
            resnet.layer1
        )

        self.layer2 = resnet.layer2
        self.layer3 = resnet.layer3
        self.layer4 = resnet.layer4

        self.rcu1 = torch.nn.Sequential(
            nn.ReLU(inplace=True),
            conv3x3(512, 512, stride=1),
            nn.ReLU(inplace=True),
            conv3x3(512, 512, stride=1)
        )
        self.rcu2 = torch.nn.Sequential(
            nn.ReLU(inplace=True),
            conv3x3(256, 256, stride=1),
            nn.ReLU(inplace=True),
            conv3x3(256, 256, stride=1)
        )
        self.rcu3 = torch.nn.Sequential(
            nn.ReLU(inplace=True),
            conv3x3(128, 128, stride=1),
            nn.ReLU(inplace=True),
            conv3x3(128, 128, stride=1)
        )
        self.rcu4 = torch.nn.Sequential(
            nn.ReLU(inplace=True),
            conv3x3(64, 64, stride=1),
            nn.ReLU(inplace=True),
            conv3x3(64, 64, stride=1)
        )
        self.multires = torch.nn.Sequential(
            nn.Conv2d(512, 256, kernel_size=3, stride=1, padding=1, bias=False),
            nn.Upsample(scale_factor=2, mode='bilinear')
        )
        self.multires2 = torch.nn.Sequential(
            nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1, bias=False),
            nn.Upsample(scale_factor=2, mode='bilinear')
        )
        self.multires3 = torch.nn.Sequential(
            nn.Conv2d(256, 128, kernel_size=3, stride=1, padding=1, bias=False),
            nn.Upsample(scale_factor=2, mode='bilinear')
        )
        self.multires4 = torch.nn.Sequential(
            nn.Conv2d(128, 64, kernel_size=3, stride=1, padding=1, bias=False),
            nn.Upsample(scale_factor=2, mode='bilinear')
        )
        self.multires_end = torch.nn.Sequential(
            nn.Conv2d(64, 1, kernel_size=3, stride=1, padding=1, bias=False),
            nn.Upsample(scale_factor=4, mode='bilinear')
        )
        self.res1 = torch.nn.Sequential(
            nn.MaxPool2d(kernel_size=5, stride=1, padding=2),
            nn.Conv2d(512, 512, kernel_size=3, stride=1, padding=1, bias=False)
        )
        self.res2 = torch.nn.Sequential(
            nn.MaxPool2d(kernel_size=5, stride=1, padding=2),
            nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1, bias=False)
        )
        self.res3 = torch.nn.Sequential(
            nn.MaxPool2d(kernel_size=5, stride=1, padding=2),
            nn.Conv2d(128, 128, kernel_size=3, stride=1, padding=1, bias=False)
        )
        self.res4 = torch.nn.Sequential(
            nn.MaxPool2d(kernel_size=5, stride=1, padding=2),
            nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1, bias=False)
        )
        self.relu = nn.ReLU(inplace=True)
        self.m = nn.Sigmoid()
    
    def respool1(self, inputs):
        inputs = self.relu(inputs)
        for i in range(0, 4):
            output = self.res1(inputs)
            inputs = inputs + output
        return inputs

    def respool2(self, inputs):
        inputs = self.relu(inputs)
        for i in range(0, 4):
            output = self.res2(inputs)
            inputs = inputs + output
        return inputs

    def respool3(self, inputs):
        inputs = self.relu(inputs)
        for i in range(0, 4):
            output = self.res3(inputs)
            inputs = inputs + output
        return inputs

    def respool4(self, inputs):
        inputs = self.relu(inputs)
        for i in range(0, 4):
            output = self.res4(inputs)
            inputs = inputs + output
        return inputs
    
    
    def forward(self, x):
        x1 = self.layer1(x)
        x2 = self.layer2(x1)
        x3 = self.layer3(x2)
        x4 = self.layer4(x3)

        out = x4 + self.rcu1(x4)   # first step
        out = out + self.rcu1(out)
        out = self.respool1(out)
        out = out + self.rcu1(out)

        out2 = x3 + self.rcu2(x3)  # second step
        out2 = out2 + self.rcu2(out2)
        out = self.multires(out) + out2
        out = self.respool2(out)
        out = out + self.rcu2(out)

        out2 = x2 + self.rcu3(x2)  # third step
        out2 = out2 + self.rcu3(out2)
        out = self.multires3(out) + out2
        out = self.respool3(out)
        out = out + self.rcu3(out)

        out2 = x1 + self.rcu4(x1)  # fourth step
        out2 = out2 + self.rcu4(out2)
        out = self.multires4(out) + out2
        out = self.respool4(out)
        out = out + self.rcu4(out)

        out = self.multires_end(out)
        out = self.m(out)

        return out

You have a lot of nn.ReLU(inplace=True) layers :wink: I don't see any other in-place operation.

I thought about that, but even without any in-place ReLU I still get the same error :confused:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

There are no other in-place operations here.
I would double check that I didn't miss any in-place ReLU or some other code that does an in-place operation. If that doesn't solve it, a good way to debug this is to remove part of the net and add a .sum().backward() in the middle, to check that everything above is correct, or use .detach() in the middle (that will prevent backprop from going above) to check that the bottom part of your net is correct. (Here top means where the input is, and bottom is where the output is :smiley: ).
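For example, a minimal sketch of that bisection idea (the two halves and tensor shapes below are just placeholders, not this network):

import torch
import torch.nn as nn

# stand-ins for the top and bottom halves of the net being debugged
top = nn.Sequential(nn.Linear(8, 8), nn.ReLU())
bottom = nn.Sequential(nn.Linear(8, 1), nn.Sigmoid())

x = torch.randn(4, 8, requires_grad=True)

# 1) check the top half alone: if this backward() already fails,
#    the offending in-place op is above this point
mid = top(x)
mid.sum().backward()

# 2) check the bottom half alone: detach() blocks backprop into `top`,
#    so if this backward() fails the problem is below this point
mid = top(x).detach().requires_grad_()
out = bottom(mid)
out.sum().backward()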

mhh yeah good idea :slight_smile: thanks a lot :slight_smile:

I tried to detach; let's say this is my network:

out4 = x1 + self.rcu4(x1)  # fourth step
out4 += self.rcu4(out4)
out4 += out3
out4 = self.respool4(out4)
out4 += self.rcu4(out4)
out4 = self.multires_end(out4)
out4.detach()

out4 = self.m(out4)
return out4

I still have an error :

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

Then it is possibly in your loss function? Or whatever you do with the output of this network?

I found at least one cause for the moment:

out4 += something

breaks the computational graph, while:

out4 = out4 + something

doesn't.

It depends on where out4 is used. Changing data in place is tricky because the original value is often needed to compute the backward pass of other operations.
In Python, += is an in-place operation, = is not.
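A tiny, self-contained illustration (toy tensors, unrelated to the network above):

import torch

x = torch.randn(3, requires_grad=True)
y = torch.exp(x)      # exp saves its output for the backward pass
y = y + 1             # out-of-place: creates a new tensor
y.sum().backward()    # works

x = torch.randn(3, requires_grad=True)
y = torch.exp(x)
y += 1                # in-place: overwrites the value exp saved
y.sum().backward()    # RuntimeError: ... modified by an inplace operation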

this explains that :slight_smile: thanks for the help :slight_smile:

it's the last Sigmoid activation…

self.m = nn.Sigmoid()
: 
: 
out = self.m(out)

I have not found a solution to this. It seems that using a Sigmoid activation on the output layer (so that the output remains between 0 and 1) causes a problem with the automatic differentiation during backpropagation.
Sigmoid() works just fine before the output layer! And there is no "inplace=False" flag!
Did you ever find a solution??
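For what it's worth, sigmoid saves its own output for the backward pass (grad = out * (1 - out)), so this error usually means the sigmoid output gets modified in place afterwards (e.g. in the loss or in some post-processing), rather than Sigmoid itself being broken. A toy reproduction with hypothetical tensors:

import torch

x = torch.randn(4, requires_grad=True)
out = torch.sigmoid(x)
out[out > 0.5] = 1        # in-place thresholding of the sigmoid output
out.sum().backward()      # RuntimeError: ... modified by an inplace operation

x = torch.randn(4, requires_grad=True)
out = torch.sigmoid(x)
out = torch.where(out > 0.5, torch.ones_like(out), out)  # out-of-place version
out.sum().backward()      # works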

Did you find a solution to this? @raz1313