Why not set `param.requires_grad = False` in D network when training G network in DCGAN?

In the DCGAN example, we first train the D network with the real and fake labels. After that, we train the G network without updating the D network. Why doesn't the DCGAN code set requires_grad = False on D's parameters when training the G network? This is what I mean:

        for param in D.parameters():
            param.requires_grad = True
        # train with real
        ...
        # train with fake
        ...
        # ------- Update G network ------------
        # Do not update D network
        for param in D.parameters():
            param.requires_grad = False
        netG.zero_grad()
        output = netD(fake)
        errG = criterion(output, label)
        errG.backward()
        optimizerG.step()

When training D, the fake input has been detached, so it is only data and no longer a Variable.

But my question is about training G.

Yes, of course.

If the data fed into D is no longer a Variable but a plain tensor, then calling backward() on the output of D will not affect the gradients of G. So there is no need to set requires_grad = False; the detach already does this.
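To make this concrete, here is a minimal sketch (using hypothetical tiny G and D networks, not the actual DCGAN models) showing that detaching G's output before feeding it to D stops gradients from ever reaching G:

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the generator and discriminator
G = nn.Linear(4, 4)
D = nn.Linear(4, 1)

noise = torch.randn(2, 4)
fake = G(noise)

# Train-D step: detach() cuts the graph, so backward() stops at `fake`
out = D(fake.detach())
out.sum().backward()

print(G.weight.grad is None)      # True: no gradient flowed into G
print(D.weight.grad is not None)  # True: D received gradients
```

So during the D step, the detach alone guarantees G is untouched, with no need to flip requires_grad.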


Great. So when training G, if I want to update the G network only while still feeding data forward through the D network, I still need to set param.requires_grad = False. Am I right?

Note that this is different from the case above, in that while training G I also want to get a prediction from the D network. The code looks like:

        for param in D.parameters():
            param.requires_grad = False
        netG.zero_grad()
        output = netD(fake)
        output2 = netG(fake)  # one more here
        errG = criterion(output, label)
        errD = criterion(output2, label)
        errG.backward()
        optimizerG.step()

No, you don’t have to do that. In this example you are using two different optimizers (one per network), and each one only receives the parameters of the network it is meant to optimize. When optimizing G there is no need to set param.requires_grad = False, as long as you don’t call optimizer_D.step().
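A small sketch of this point (again with hypothetical tiny networks): D's parameters do accumulate gradients during errG.backward(), but optimizerG.step() only updates the parameters it was constructed with, so D's weights stay unchanged:

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the generator and discriminator
G = nn.Linear(4, 4)
D = nn.Linear(4, 1)
optimizerG = torch.optim.SGD(G.parameters(), lr=0.1)

D_weight_before = D.weight.detach().clone()

out = D(G(torch.randn(2, 4)))  # no detach: grads flow through D into G
out.sum().backward()

optimizerG.step()  # updates G only

print(D.weight.grad is not None)               # True: D has gradients...
print(torch.equal(D.weight, D_weight_before))  # True: ...but was not updated
```

The only cost of skipping requires_grad = False here is that D's .grad buffers fill up, which is why the DCGAN example calls netD.zero_grad() before the next D step.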

I’m confused by your code: why do you feed the fake image into both G and D? The fake is the output of netG(noise). And if you are training G and also want to see a prediction from D, why not JUST PRINT IT OUT? The prediction of D is already computed when training G. Here is the code from the example's G training:

        netG.zero_grad()
        label.fill_(real_label)
        output = netD(fake)
        print(output)        # here , if you want to print it, just print it. 
                             # the whole code just train G because it only 
                             # does optimizerG.step()
        errG = criterion(output, label)
        errG.backward()
        D_G_z2 = output.mean().item()
        optimizerG.step()

I fed data to G to check whether the G network is actually being trained. I know the normal DCGAN just trains G here and keeps D fixed, but I wanted to verify whether, during G's training step, the D and G networks are updated or not.
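If the goal is just to verify which network gets updated, one way (a sketch, not from the DCGAN example itself) is to snapshot a network's parameters before the step and compare afterwards, rather than running an extra forward pass:

```python
import torch
import torch.nn as nn

# Hypothetical tiny network standing in for G or D
net = nn.Linear(4, 1)
opt = torch.optim.SGD(net.parameters(), lr=0.1)

before = [p.detach().clone() for p in net.parameters()]

net(torch.randn(2, 4)).sum().backward()
opt.step()

changed = any(not torch.equal(p, q)
              for p, q in zip(net.parameters(), before))
print(changed)  # True: the step modified the parameters
```

Running this check on D right after optimizerG.step() would show changed == False, confirming D stays fixed during the G update.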