So, training a GAN model is broken into two steps:
- Updating the discriminator (while the generator is frozen)
- Updating the generator (while the discriminator is frozen)
These two steps alternate at every iteration: during the discriminator's update step the generator is frozen, and during the generator's update step the discriminator is frozen.
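The alternation can be sketched as follows; `update_discriminator` and `update_generator` are hypothetical placeholders for the two steps described above, and the list `order` is only there to make the alternation visible:

```python
order = []  # records which step ran, just for illustration

def update_discriminator(real_batch):
    # step 1: train D on real and fake samples (G frozen)
    order.append("D")

def update_generator():
    # step 2: train G to fool D (D frozen)
    order.append("G")

dataloader = ["batch0", "batch1"]  # stand-in for a real DataLoader
for real_batch in dataloader:
    update_discriminator(real_batch)
    update_generator()

# order is now ["D", "G", "D", "G"]
```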
Since your question is about the generator: to compute the loss for training the generator, we feed the input (a latent vector, or sometimes an input image) to the generator and get an output x_syn.
For example, if G is our generator and its input is a latent vector z, then x_syn = G(z). Now, we feed the synthesized image (also called the fake image) x_syn to the discriminator to get output = D(x_syn).
Here, D is frozen so that we do not compute gradients for D's parameters. Then, we compute the loss associated with this fake (synthesized) image: g_loss = criterion(output, labels), where labels are the labels for real images, since we expect the generator to generate real-looking images.
Now, calling .backward() on g_loss will compute the gradients of the generator network G, since the discriminator was frozen when g_loss was computed (as mentioned above). These gradients are then used to update the generator network.
Note that even if you do not freeze the discriminator, since only the parameters of the generator are passed to the optimizer optim_g, calling optim_g.step() will not affect the parameters of the discriminator. However, it is more efficient to freeze the discriminator when we intend to update the generator, since that avoids computing gradients for D's parameters.
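Putting the whole generator-update step together, here is a minimal runnable sketch. The tiny MLPs, the latent size, and the BCE loss are illustrative stand-ins, not a specific model from the question; the freezing is done with `requires_grad_(False)`:

```python
import torch
import torch.nn as nn

# Toy stand-ins for a real generator and discriminator (illustrative sizes)
G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
D = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
criterion = nn.BCEWithLogitsLoss()
optim_g = torch.optim.SGD(G.parameters(), lr=0.1)  # only G's parameters

# Freeze D so no gradients are computed for its parameters
for p in D.parameters():
    p.requires_grad_(False)

z = torch.randn(5, 8)              # latent vectors
x_syn = G(z)                       # synthesized (fake) images
output = D(x_syn)                  # discriminator logits for the fakes
labels = torch.ones(5, 1)          # "real" labels: we want D to be fooled
g_loss = criterion(output, labels)

optim_g.zero_grad()
g_loss.backward()                  # gradients flow into G only
optim_g.step()                     # updates G; D is untouched

# D's parameters received no gradients; G's did
assert all(p.grad is None for p in D.parameters())
assert all(p.grad is not None for p in G.parameters())
```

Note that gradients still flow *through* D's layers back to x_syn (that is what makes g_loss useful to G); freezing only prevents gradients from being accumulated *for* D's own parameters.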