Enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True)

one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [64, 1, 7, 7]] is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
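For context, autograd keeps a version counter on every tensor: each in-place op bumps it, and backward() checks that tensors saved for the backward pass still have the version they had when they were saved. A minimal, self-contained snippet (unrelated to the code below, just an illustration) that triggers the same class of error:

    import torch

    x = torch.ones(3, requires_grad=True)
    y = torch.sigmoid(x)  # sigmoid saves its output y for the backward pass
    y.add_(1)             # in-place add bumps y's version from 0 to 1
    y.sum().backward()    # RuntimeError: one of the variables needed for gradient
                          # computation has been modified by an inplace operation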

Hi,
Would you please help me with this error? It occurs at errG.backward() in the generator part. The [64, 1, 7, 7] refers to fakefinal in the "creating final fake" section below. My code worked before; the only thing I changed is the fake part. I read some comments on similar errors, but I couldn't understand what exactly I should do.

# images1, Negpach, ff, Counter11, ItrD, etc. are defined elsewhere in the full script
for epoch in range(num_epochs):

    for pos, neg in zip(trainloader, trainloaderNeg):

        ############################
        # (1) Update D network: maximize log(D(x)) + log(1 - D(G(z)))
        ############################
        ## Train with all-real batch
        netD.zero_grad()

        real_cpu = images1.to(device)

        b_size = real_cpu.size(0)
        label = torch.full((b_size,), real_label, device=device)

        netD = netD.float()
        output = netD(real_cpu).view(-1)

        ## ------------ Accuracy of the discriminator ----------------

        errD_real = criterion(output, label)

        # Calculate gradients for D in backward pass
        errD_real.backward()
        D_x = output.mean().item()

        noise = torch.randn(b_size, nz, 1, 1, device=device)

        # Generate fake image batch with G
        netG = netG.float()
        fake = netG(noise).to(device)

        ## ------------- creating final fake ---------------
        fakefinal = Negpach  # NOTE: no copy is made; fakefinal aliases Negpach
        Negpach33 = Negpach[:, :, 5-ff:5+ff+1, 5-ff:5+ff+1]
        fake44 = torch.mul(fake, Negpach33)
        # in-place write of the masked fake patch into fakefinal
        fakefinal[:, :, 5-ff:5+ff+1, 5-ff:5+ff+1] = fake44

        ## --------------------------------
        label.fill_(fake_label)
        # Classify all fake batch with D
        output = netD(fakefinal.detach()).view(-1)

        # Calculate D's loss on the all-fake batch
        errD_fake = criterion(output, label)
        errD_fake.backward()
        D_G_z1 = output.mean().item()

        errD = errD_real + errD_fake
        # Update D
        if Counter11 % ItrD == 0:
            optimizerD.step()

        ############################
        # (2) Update G network: maximize log(D(G(z)))
        ############################
        netG.zero_grad()
        label.fill_(real_label)  # fake labels are real for generator cost

        output = netD(fakefinal).view(-1)
        errG = criterion(output, label)
        # Calculate gradients for G
        errG.backward()
        # Update G
        optimizerG.step()

Hi,

Have you tried enabling anomaly detection as proposed in the error?
What does the anomaly-detection warning printed just before the error point to?

Many thanks for your answer. Where should I add torch.autograd.set_detect_anomaly(True)?

It sets a global flag. So you just need to enable it before you start doing the forward pass.
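For example, a minimal sketch reusing the names from your loop (everything else assumed to be set up as in your script):

    import torch

    # Enable anomaly detection once, globally, before any forward pass runs.
    torch.autograd.set_detect_anomaly(True)

    for epoch in range(num_epochs):
        for pos, neg in zip(trainloader, trainloaderNeg):
            ...  # training code unchanged; the failing backward's traceback now
                 # also shows the forward op that created the problematic tensor

If you only want it around a suspect region (it slows training noticeably), the context-manager form with torch.autograd.detect_anomaly(): works too. In your snippet, one likely culprit is that fakefinal = Negpach does not copy, so the slice assignment fakefinal[:, :, 5-ff:5+ff+1, 5-ff:5+ff+1] = fake44 mutates Negpach in place on every iteration; the usual out-of-place fix is to start from a copy, e.g. fakefinal = Negpach.clone().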


Appreciate your help