How to use zero_grad with GANs

In the pix2pix code that can be found here, the generator is trained first and then the discriminator. My question is: aren't the gradients computed by loss_D.backward() still accumulated when the generator is trained in the next loop? Shouldn't we call zero_grad on the discriminator again before training the generator?

loss_D.backward() won't accumulate any gradients for the generator parameters.

This is because of this line:

pred_fake = discriminator(fake_B.detach(), real_A)

The detach method returns a new tensor that is cut off from the computation graph, so no gradients flow back through it to the generator.
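
A minimal sketch of that behaviour (toy one-layer modules standing in for the real networks, and the conditioning input real_A dropped from the discriminator call for brevity):

import torch
import torch.nn as nn

# Toy stand-ins for the pix2pix generator and discriminator.
generator = nn.Linear(4, 4)
discriminator = nn.Linear(4, 1)

real_A = torch.randn(2, 4)
fake_B = generator(real_A)

# The discriminator sees a detached copy, so the graph back to the generator is cut.
pred_fake = discriminator(fake_B.detach())
loss_D = pred_fake.mean()
loss_D.backward()

print(generator.weight.grad)      # None: nothing flowed back to the generator
print(discriminator.weight.grad)  # a tensor: the discriminator did receive gradients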

Thanks for the reply. What I was worried about was actually the adversarial part. In the following snippet, the pixel-wise loss and the loss from the discriminator are combined and backpropagation is called. Isn't the generator also updated through the backpropagation path that goes through the discriminator? But the gradients from the previous loop are not zeroed, and I couldn't fully understand how gradient accumulation works in this case.

        # GAN loss
        fake_B = generator(real_A)
        pred_fake = discriminator(fake_B, real_A)
        loss_GAN = criterion_GAN(pred_fake, valid)
        # Pixel-wise loss
        loss_pixel = criterion_pixelwise(fake_B, real_B)

        # Total loss
        loss_G = loss_GAN + lambda_pixel * loss_pixel

        loss_G.backward()

        optimizer_G.step()

Remember that the parameters are updated through the optimizer's step method.
When the optimizer is initialized, you pass it the parameters it is allowed to modify.
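
For reference, the two optimizers in a pix2pix-style script are created roughly like this (a sketch; the exact learning rate and betas come from the script's options, the values below are only illustrative):

optimizer_G = torch.optim.Adam(generator.parameters(), lr=2e-4, betas=(0.5, 0.999))
optimizer_D = torch.optim.Adam(discriminator.parameters(), lr=2e-4, betas=(0.5, 0.999))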

So yes, there are still some uncleared gradients in the discriminator's parameters. However:

loss_G.backward()
optimizer_G.step()

won't be affected by them: optimizer_G.step() only modifies the generator's parameters, based on their gradients w.r.t. loss_G.
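
A minimal sketch of that point (again with toy modules, not the pix2pix architectures): after the generator update, the discriminator's .grad buffers are populated, but its weights are unchanged, and the zero_grad call at the start of the discriminator step clears those stale gradients before loss_D.backward() runs.

import torch
import torch.nn as nn

# Toy modules again, just to show which parameters step() touches.
generator = nn.Linear(4, 4)
discriminator = nn.Linear(4, 1)
optimizer_G = torch.optim.Adam(generator.parameters(), lr=1e-3)
optimizer_D = torch.optim.Adam(discriminator.parameters(), lr=1e-3)

real_A = torch.randn(2, 4)

# Generator update: the loss backpropagates through the discriminator as well,
# so the discriminator's .grad buffers get filled...
optimizer_G.zero_grad()
fake_B = generator(real_A)
loss_G = discriminator(fake_B).mean()   # stand-in for the adversarial loss
loss_G.backward()

d_weight_before = discriminator.weight.clone()
optimizer_G.step()

# ...but step() only touches the generator's parameters:
print(torch.equal(discriminator.weight, d_weight_before))  # True

# And the discriminator's own update clears those stale gradients first:
optimizer_D.zero_grad()
print(discriminator.weight.grad)  # None or zeros, depending on the set_to_none default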