RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [6, 256, 1, 1]] is at version 3; expected version 2 instead

Hi all,
I’ve been reading about problems similar to mine, but I’m still not getting anywhere.
My training code, which I have modified, looks like the one below:

# ----------
#  Training
# ----------

for epoch in range(n_epochs + 1):

    for i, (imgs_A, gt_A, imgs_B, _) in enumerate(tqdm(train_loader, desc='Epoch: {}/{}'.format(epoch, n_epochs))):

        # Configure input
        imgs_A = imgs_A.to(device)
        imgs_B = imgs_B.to(device)
        gt_A = gt_A.to(device)

        # ----------------------
        #  Train Discriminators
        # ----------------------

        optimizer_D_A.zero_grad()
        optimizer_D_B.zero_grad()

        # Generate a batch of images (detached for the discriminator update)
        fake_A = G_BA(imgs_B).detach()
        fake_B = G_AB(imgs_A).detach()

        # ----------
        #  Domain A
        # ----------

        # Compute gradient penalty for improved Wasserstein training
        gp_A = compute_gradient_penalty(D_A, imgs_A.data, fake_A.data)
        # Adversarial loss
        D_A_loss = -torch.mean(D_A(imgs_A)) + torch.mean(D_A(fake_A)) + lambda_gp * gp_A

        # ----------
        #  Domain B
        # ----------

        # Compute gradient penalty for improved Wasserstein training
        gp_B = compute_gradient_penalty(D_B, imgs_B.data, fake_B.data)
        # Adversarial loss
        D_B_loss = -torch.mean(D_B(imgs_B)) + torch.mean(D_B(fake_B)) + lambda_gp * gp_B

        # Total loss
        D_loss = D_A_loss + D_B_loss

        D_loss.backward()
        optimizer_D_A.step()
        optimizer_D_B.step()

        if i % n_critic == 0:

            # ------------------
            #  Train Generators
            # ------------------

            # Translate images to the opposite domain
            fake_A = G_BA(imgs_B)
            fake_B = G_AB(imgs_A)

            # Reconstruct images
            recov_A = G_BA(fake_B)
            recov_B = G_AB(fake_A)

            # Segmentation loss
            real_pred = SemSegModel(imgs_A)
            Seg_loss_Real = SemSegcriterion(real_pred, gt_A)

            fake_pred = SemSegModel(fake_B)
            Seg_loss_Fake = SemSegcriterion(fake_pred, gt_A)

            Seg_loss = Seg_loss_Real + Seg_loss_Fake
            SemSegopt.zero_grad()
            Seg_loss.backward(retain_graph=True)
            SemSegopt.step()

            # Adversarial loss
            G_adv = -torch.mean(D_A(fake_A)) - torch.mean(D_B(fake_B))
            # Cycle loss
            G_cycle = cycle_loss(recov_A, imgs_A) + cycle_loss(recov_B, imgs_B)
            # Total loss
            G_loss = lambda_adv * G_adv + lambda_cycle * G_cycle + lambda_seg * (Seg_loss_Real + Seg_loss_Fake)
            optimizer_G.zero_grad()
            G_loss.backward()
            optimizer_G.step()

    Lr_scheduler_G.step()
    Lr_scheduler_D_A.step()
    Lr_scheduler_D_B.step()
    Lr_scheduler_SemSegopt.step()

I got this error after trying to run the code:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [6, 256, 1, 1]] is at version 3; expected version 2 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

How can I solve it? Any help would be appreciated!
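
For reference, this is how I understand the anomaly-detection hint from the error message; the call goes somewhere before the training loop (it slows training down, so debugging only):

import torch

# Record which forward op produced each gradient, so a failing backward
# reports the in-place modification with a proper stack trace
torch.autograd.set_detect_anomaly(True)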

These errors are often raised by using retain_graph = True as a workaround for another issue. Could you explain why you are using it? If you are not sure and added it to avoid the “trying to backpropagate a second time…” error, check if you have forgotten to detach the computation graph to avoid trying to recompute gradients from previous iterations.
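
As a minimal sketch of what detaching means here (generic names, not your actual modules): when the output of one model is only used as an input for another model’s update, detach it so the second backward() does not walk back into the first model’s graph:

fake = generator(noise)

# Discriminator update: detach, so d_loss.backward() stops at `fake`
# and does not try to reuse (or free) the generator's graph
d_loss = criterion(discriminator(fake.detach()), fake_label)
d_loss.backward()

# Generator update: no detach, since gradients should flow into the generator
g_loss = criterion(discriminator(fake), real_label)
g_loss.backward()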

Thank you for the help.
Actually, I added it to avoid the “trying to backpropagate a second time…” error.
How can I detach the computation graph? Which values do I have to detach?

Also, when I remove retain_graph=True, this new error is raised:

RuntimeError: Trying to backward through the graph a second time (or directly access saved tensors after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved tensors after calling backward.
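
Is the actual problem the ordering, i.e. that SemSegopt.step() updates the weights of SemSegModel in-place before G_loss.backward() has backpropagated through them (since Seg_loss_Real and Seg_loss_Fake appear in both losses)? Would something like this be a correct reordering (a sketch reusing the variables from my loop above)?

# All backward passes first, all optimizer steps last, so that no step()
# modifies weights that a pending backward pass still needs
SemSegopt.zero_grad()
optimizer_G.zero_grad()

Seg_loss.backward(retain_graph=True)  # graph is reused by G_loss.backward()
G_loss.backward()

SemSegopt.step()
optimizer_G.step()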

Hi Sarmad, I wonder if you have found a fix for this? I have encountered the same error on a model that was working fine before. After making some changes to the loss function, I get a similar stack trace:
allow_unreachable=True, accumulate_grad=True) # Calls into the C++ engine to run the backward pass

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [256]] is at version 2; expected version 1 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
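
From reading around, any in-place modification of a tensor that autograd has saved for the backward pass seems to reproduce this in isolation, e.g. (toy sketch, unrelated to my actual model):

import torch

w = torch.rand(256, requires_grad=True)
h = w * 2              # non-leaf tensor
out = (h ** 2).sum()   # the pow op saves h for its backward
h += 1                 # in-place op bumps h's version counter
out.backward()         # RuntimeError: ... [torch.FloatTensor [256]] is at version 1; expected version 0

In my case I suspect either an optimizer.step() between two backward passes that share a graph, or an nn.ReLU(inplace=True) applied to an activation that another op saved for backward.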