Usage of detach for GAN-based training

Hello all. I am training a GAN-based model for segmentation. Training this architecture has two phases: training the D network and training the G network. While training the G network, I do not want to update the parameters of the D network. How should I do this?

The loss of the G network looks like

loss = loss_G + 0.5 * loss_D
            # Train G  (F is torch.nn.functional)
            pred = netG(images)
            loss_S = criterionS(pred, targets)        # segmentation loss
            D_pred = netD(F.softmax(pred, dim=1))     # score G's prediction with D
            optimizerS.zero_grad()
            loss_D = criterionD(D_pred, targets)      # adversarial loss term
            loss = loss_S + 0.3 * loss_D
            loss.backward()
            optimizerD.step()

The first solution is

            for param in netD.parameters():
                param.requires_grad = False   # freeze D while training G

The second solution is loss_D = criterionD(D_pred.detach(), targets)

Which is correct?

Take a look at this example - https://github.com/devnag/pytorch-generative-adversarial-networks/blob/master/gan_pytorch.py

Basically, only take steps with your generator's optimizer when training the generator.
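
As a rough sketch of what that means for the snippet above (optimizerG here is an assumed name for the generator's optimizer; the other names follow the original post):

            import torch.nn.functional as F

            # Train G: compute both loss terms, but step only the generator's optimizer.
            pred = netG(images)
            loss_S = criterionS(pred, targets)
            D_pred = netD(F.softmax(pred, dim=1))
            loss_D = criterionD(D_pred, targets)
            optimizerG.zero_grad()
            loss = loss_S + 0.3 * loss_D
            loss.backward()               # also writes gradients into netD's .grad buffers
            optimizerG.step()             # but only netG's parameters are updated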


Great link. I have a question. In my case the generator loss is loss = loss_G + 0.5 * loss_D, i.e. a combination of the generator loss and the discriminator loss, but we only update the parameters of the generator. How should I do that?

Just run optimizerD.step()


Do you mean optimizerG.step()? Because we do not update D.

Right, sorry about that. Make sure you zero out the gradients before taking another step too.
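
Something like this (a sketch; the optimizer names are assumed from the thread):

            # At the start of each G update, clear the gradients of both networks,
            # so the D gradients produced by the previous loss.backward() are discarded.
            optimizerG.zero_grad()
            optimizerD.zero_grad()
            # ... forward pass, loss = loss_S + 0.3 * loss_D, loss.backward() ...
            optimizerG.step()        # only the generator's parameters are updated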