In the DCGAN example, we first train the D network with real and fake labels. After that, we train the G network without updating the D network. Why doesn't the DCGAN code set requires_grad = False on D when training the G network? This is what I had in mind:
for param in netD.parameters():
    param.requires_grad = True
# train with real
...
# train with fake
...
# -------- Update G network --------
# Do not update D network
for param in netD.parameters():
    param.requires_grad = False
netG.zero_grad()
output = netD(fake)
errG = criterion(output, label)
errG.backward()
optimizerG.step()
If the data fed into D is detached from the graph (a plain tensor rather than part of G's computation), then calling backward() on D's output cannot influence the gradients of G. So there is no need to set requires_grad = False when training D; the fake.detach() already does this.
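The detach() point can be checked directly. A minimal sketch, with toy one-layer stand-ins for G and D (the tiny Linear modules here are hypothetical, just to illustrate): after backward() through D on a detached fake batch, D has gradients and G has none.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the DCGAN generator and discriminator.
netG = nn.Linear(2, 2)
netD = nn.Linear(2, 1)

noise = torch.randn(4, 2)
fake = netG(noise)

# Train D on the fake batch: detach() cuts the graph back to G,
# so backward() through D leaves G's gradients untouched.
out = netD(fake.detach())
out.sum().backward()

print(netD.weight.grad is not None)  # True: D received gradients
print(netG.weight.grad is None)      # True: G received none
```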
Great. So while training G, if I want to update only the G network while still feeding data forward through the D network, I still need to set param.requires_grad = False. Am I right?
Note that this is different from the case above, in which, during training of G, I also want to get a prediction from the D network. The code looks like:
for param in netD.parameters():
    param.requires_grad = False
netG.zero_grad()
output = netD(fake)
output2 = netG(fake)  # one more forward pass here
errG = criterion(output, label)
errD = criterion(output2, label)
errG.backward()
optimizerG.step()
No, you don't have to do that. In this example you are using two different optimizers (one per network), and each only receives the parameters of one network to optimize. When optimizing G there is no need to set param.requires_grad = False, as long as you don't call optimizerD.step().
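A quick sketch of why this works, again with hypothetical one-layer stand-ins: even without requires_grad = False, the backward pass fills D's .grad buffers, but optimizerG.step() moves only G's weights; D's stale gradients are simply cleared by netD.zero_grad() before D's own next update.

```python
import torch
import torch.nn as nn
import torch.optim as optim

netG = nn.Linear(2, 2)   # stand-in for the generator
netD = nn.Linear(2, 1)   # stand-in for the discriminator
optimizerG = optim.SGD(netG.parameters(), lr=0.1)  # holds only G's params

d_before = netD.weight.clone()
g_before = netG.weight.clone()

fake = netG(torch.randn(4, 2))
out = netD(fake)          # no detach: gradients flow through D into G
out.sum().backward()      # D's .grad buffers ARE populated here...
optimizerG.step()         # ...but only G's weights actually move

print(torch.equal(netD.weight, d_before))   # True: D's weights unchanged
print(torch.equal(netG.weight, g_before))   # False: G's weights changed
netD.zero_grad()          # clear D's stale grads before D's own update
```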
I'm confused about your code: why did you feed the fake image to both G and D? fake is already the output of netG(noise). And if you are training G and also want a prediction from D, why not just print it out? The prediction from D is already computed when training G. Here is the code from the example's G-training step:
netG.zero_grad()
label.fill_(real_label)
output = netD(fake)
print(output)  # here, if you want to see D's prediction, just print it
# the whole block trains only G because it only
# calls optimizerG.step()
errG = criterion(output, label)
errG.backward()
D_G_z2 = output.mean().item()
optimizerG.step()
I fed data to G to check whether the G network is being trained. I know the normal DCGAN just trains G and keeps D fixed at this step. But I just wanted to check whether, during the G step, the D and G networks get trained or not.
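One direct way to check this, instead of feeding extra data through G, is to snapshot every parameter before the G step and compare afterwards. A sketch with hypothetical tiny networks:

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Hypothetical tiny G and D just to illustrate the check.
netG = nn.Linear(2, 2)
netD = nn.Linear(2, 1)
optimizerG = optim.SGD(netG.parameters(), lr=0.1)

# Snapshot every parameter of both networks before the G step.
snapG = [p.clone() for p in netG.parameters()]
snapD = [p.clone() for p in netD.parameters()]

netG.zero_grad()
output = netD(netG(torch.randn(4, 2)))
output.sum().backward()
optimizerG.step()

g_changed = any(not torch.equal(p, s) for p, s in zip(netG.parameters(), snapG))
d_changed = any(not torch.equal(p, s) for p, s in zip(netD.parameters(), snapD))
print(g_changed, d_changed)  # True False: only G was trained
```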