Trying to backward through the graph a second time with multiple losses

Hello! I’m trying to adjust the generator’s parameters to maximize the target model f’s cross-entropy loss and to minimize the discriminator-related loss, lossG.

For the discriminator, I’m trying to adjust its parameters to minimize lossD.

In the process, I’m getting an error from calling backward() twice.

Here is the code block:

        # Train Generator: min log(1 - D(G(z))) <-> max log(D(G(z)))
        
        adv_ex = adv_ex.reshape(32, 28*28)
        output = disc(adv_ex)  # discriminator decides if adv_ex is real or fake
        lossG = torch.mean(torch.log(1. - output))  # loss for the generator's desired discriminator prediction

        adv_ex = adv_ex.reshape(-1, 1, 28, 28)
        f_pred = target(adv_ex)  # .size() = [32, 10]
        f_loss = -CE_loss(f_pred, labels)  # negated CE: loss for the generator's desired target-model prediction
        loss_G_Final = f_loss + lossG  # can change the weights of these loss terms later
        
        opt_gen.zero_grad()
        loss_G_Final = loss_G_Final.to(device)
        loss_G_Final.backward()
        opt_gen.step()
        
        # Train Discriminator: max log(D(x)) + log(1 - D(G(z)))
        
        adv_ex = adv_ex.reshape(32, 784)
        disc_real = disc(real).view(-1)
        disc_fake = disc(adv_ex).view(-1)
        lossD = -torch.mean(torch.log(disc_real) + torch.log(1. - disc_fake))
        # can decide later how much that loss term weighs
        
        opt_disc.zero_grad()
        lossD.backward()
        opt_disc.step()

Here is the error trace:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/tmp/ipykernel_13305/405152453.py in <module>
     45 
     46         opt_disc.zero_grad()
---> 47         lossD.backward()
     48         opt_disc.step()
     49 

~/.conda/envs/mypytorch19/lib/python3.9/site-packages/torch/_tensor.py in backward(self, gradient, retain_graph, create_graph, inputs)
    253                 create_graph=create_graph,
    254                 inputs=inputs)
--> 255         torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
    256 
    257     def register_hook(self, hook):

~/.conda/envs/mypytorch19/lib/python3.9/site-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables, inputs)
    145         retain_graph = create_graph
    146 
--> 147     Variable._execution_engine.run_backward(
    148         tensors, grad_tensors_, retain_graph, create_graph, inputs,
    149         allow_unreachable=True, accumulate_grad=True)  # allow_unreachable flag

RuntimeError: Trying to backward through the graph a second time (or directly access saved variables after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved variables after calling backward.

Maybe this should be:

    disc_fake = disc(adv_ex.clone().detach()).view(-1)

since I’m assuming adv_ex is the output of your generator?
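To illustrate, here is a rough, self-contained sketch of the discriminator step with the generator output detached. The gen/disc definitions, z, and the shapes below are just placeholders so the sketch runs on its own; they are not your actual models:

    import torch
    import torch.nn as nn

    # Placeholder models and data, only so this sketch is runnable.
    gen = nn.Sequential(nn.Linear(100, 784), nn.Tanh())
    disc = nn.Sequential(nn.Linear(784, 1), nn.Sigmoid())
    opt_disc = torch.optim.Adam(disc.parameters(), lr=2e-4)

    real = torch.rand(32, 784)
    z = torch.randn(32, 100)
    adv_ex = gen(z)  # generator output, still attached to the generator's graph

    disc_real = disc(real).view(-1)
    disc_fake = disc(adv_ex.detach()).view(-1)  # detach: the D update won't backprop into G
    lossD = -torch.mean(torch.log(disc_real) + torch.log(1. - disc_fake))

    opt_disc.zero_grad()
    lossD.backward()  # only walks the discriminator's graph, so no "second backward" error
    opt_disc.step()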

I tried that change, and now I’m getting a different error:

[W python_anomaly_mode.cpp:85] Warning: Error detected in MmBackward. No forward pass information available. Enable detect anomaly during forward pass for more information. (function _print_stack)

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/tmp/ipykernel_28136/1294481728.py in <module>
     46 
     47             opt_disc.zero_grad()
---> 48             lossD.backward()
     49             opt_disc.step()
     50 

~/.conda/envs/mypytorch19/lib/python3.9/site-packages/torch/_tensor.py in backward(self, gradient, retain_graph, create_graph, inputs)
    253                 create_graph=create_graph,
    254                 inputs=inputs)
--> 255         torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
    256 
    257     def register_hook(self, hook):

~/.conda/envs/mypytorch19/lib/python3.9/site-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables, inputs)
    145         retain_graph = create_graph
    146 
--> 147     Variable._execution_engine.run_backward(
    148         tensors, grad_tensors_, retain_graph, create_graph, inputs,
    149         allow_unreachable=True, accumulate_grad=True)  # allow_unreachable flag

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [256, 784]], which is output 0 of TBackward, is at version 9; expected version 8 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!

I see. We solved the first issue; this second issue is unrelated and has to do with your implementation of the Discriminator. Check your activation functions’ inplace flag and set it to False.

What is the activation function’s inplace flag, and how do I set it to False?

Could you try changing loss_G_Final.backward() to loss_G_Final.backward(retain_graph=True)?

Also, I don’t think you should be changing the device of your loss before calling backward(). If you initialize your optimizer while your model is on the CPU and then move the model to the GPU, you can run into a device mismatch, with your parameters on one device and the gradients on another.
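For what it’s worth, this is the ordering I’d normally expect (the layer sizes and names here are just placeholders): move the model to the device first, build the optimizer from its parameters afterwards, and let the loss inherit the device from its inputs rather than calling .to(device) on it.

    import torch
    import torch.nn as nn

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # Model goes to the device *before* the optimizer is created from its parameters.
    disc = nn.Sequential(nn.Linear(784, 1), nn.Sigmoid()).to(device)
    opt_disc = torch.optim.Adam(disc.parameters(), lr=2e-4)

    real = torch.rand(32, 784, device=device)   # inputs already on the same device
    lossD = -torch.mean(torch.log(disc(real)))  # the loss lands on that device automatically

    opt_disc.zero_grad()
    lossD.backward()
    opt_disc.step()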

Essentially, you have an operator in your discriminator that changes the value of one of the matrices needed for the backward pass “in place” (i.e. at the same memory location instead of making a copy). Check your ReLU layers: if you see nn.ReLU(inplace=True) (or nn.ReLU(True)), change it to nn.ReLU(inplace=False) (or just plain nn.ReLU()).
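For example, if your discriminator looks something like this (the layer sizes below are only a guess based on the [256, 784] shape in your error), the inplace flag is the boolean you pass to the activation:

    import torch.nn as nn

    # Hypothetical discriminator. inplace=True makes the activation overwrite its
    # input tensor in memory, which can clobber values autograd saved for backward.
    disc = nn.Sequential(
        nn.Linear(784, 256),
        nn.LeakyReLU(0.2, inplace=False),  # was inplace=True; set it to False
        nn.Linear(256, 1),
        nn.Sigmoid(),
    )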