Dual Loss Calculation Failed. RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

zaber666 · July 6, 2022, 8:26pm

So, What i am basically trying to do can be summarized below.

I want to reconstruct the parameter values of the weight of a convolutional layer and want to optimize the network on both this reconstruction loss and categorial cross-entropy loss. It works fine only while optimizing on cross-entropy loss but fails when I add the reconstruction loss. Basically the gradient is not passing through the NN(Neural Net) to update it’s parameter. I want it’s weight to get updated.

The network that I build was like below,

class MODEL(nn.Module):
    def __init__(self):
        super(MODEL, self).__init__()
        
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=(3,3), padding=1)
        self.max_pool1 = nn.MaxPool2d(kernel_size=(3,3))
        self.reBuilder = nn.Conv2d(in_channels=3, out_channels=3, kernel_size=(5,5), padding=2)
        self.lossfn = torch.nn.L1Loss(reduction="mean")
        
        self.conv2 = nn.Conv2d(in_channels=64, out_channels=256, kernel_size=(3,3), padding=1)
        self.max_pool2 = nn.MaxPool2d(kernel_size=(3,3))
        self.conv3 = nn.Conv2d(in_channels=256, out_channels=512, kernel_size=(3,3), padding=1)
        self.max_pool3 = nn.MaxPool2d(kernel_size=(3,3))
        
        self.dense = nn.Linear(in_features=512, out_features=10)
        
        
    def forward(self, x):
        ckpt = self.state_dict()
        
        weight = ckpt["conv1.weight"].detach().clone().permute((3,2,1,0)) #output_shape (3,3,3,64)
        weightNew = self.reBuilder(weight)
        loss = self.lossfn(weightNew, weight)

        weightNew = weightNew.permute((3, 2, 1, 0))
        weightFinal = weightNew.detach().clone()

        ckpt["conv1.weight"] = weightFinal
        self.load_state_dict(ckpt)
        
        x = self.conv1(x)
        x = self.max_pool1(x)
        x = self.conv2(x)
        x = self.max_pool2(x)
        x = self.conv3(x)
        x = self.max_pool3(x)
        x = torch.flatten(x, start_dim=1)
        x = self.dense(x)
        
        return x, loss

The forward and backward pass code is like,

model = MODEL()
criterion = nn.BCEWithLogitsLoss().to("cuda")

outputs, recoLoss = model(images)
ceLoss = criterion(outputs, targets)
loss = ceLoss + recoLoss
loss.backward()
optimizer.step()

If loss contains only ceLoss it works. But when I add recoLoss term it gives the following error.

"RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [3, 3, 5, 5]] is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True)."

basically it is saying that, the parameters of the NN(in picture) or self.reBuilder in code have been modified already.
Can anyone help me in this regard?

AlphaBetaGamma96 · July 6, 2022, 9:38pm

zaber666:

        weight = ckpt["conv1.weight"].detach().clone().permute((3,2,1,0)) #output_shape (3,3,3,64)
        weightNew = self.reBuilder(weight)
        loss = self.lossfn(weightNew, weight)

        weightNew = weightNew.permute((3, 2, 1, 0))
        weightFinal = weightNew.detach().clone()

Can you replace the .detach().clone() with just .clone().

Also, if you load a state_dict during the forward pass surely that would overwrite all your weights? (And probably cause an in-place error as I think load_state_dict works in-place)