GAN loss function

Hi, I have a question about the vanishing gradient problem in GANs.
Many articles say that vanishing gradients can occur when using BCE to train the generator if the discriminator network is too good: the loss becomes small, so the gradient updates become small as well. However, I don't quite understand this claim.
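For concreteness, this is roughly the training step I have in mind (a minimal PyTorch sketch with tiny placeholder networks, not my actual model):

```python
import torch
import torch.nn as nn

# Tiny stand-in networks, just for illustration (placeholders, not a real GAN).
latent_dim, data_dim, batch_size = 8, 4, 16
G = nn.Sequential(nn.Linear(latent_dim, data_dim))
D = nn.Sequential(nn.Linear(data_dim, 1), nn.Sigmoid())
optimizer_G = torch.optim.Adam(G.parameters(), lr=2e-4)
criterion = nn.BCELoss()

# Generator update: the target label is 1 ("real"), i.e. BCE(D(G(z)), 1),
# so the generator is pushed to make D classify its samples as real.
z = torch.randn(batch_size, latent_dim)
loss_G = criterion(D(G(z)), torch.ones(batch_size, 1))
optimizer_G.zero_grad()
loss_G.backward()
optimizer_G.step()
```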
When training the generator, we use BCE(D(G(z)), 1), where D is the discriminator, G(z) is the generator's output, and 1 is the "real" label. From this, if the discriminator performs really well, surely the loss from the above expression will be very large, since the discriminator knows the input is fake rather than real while the target is 1. Then surely the gradient update will be larger, and there is no vanishing gradient problem here?
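To make my reasoning concrete, here is a small check I did (the value 0.01 is just a hypothetical discriminator output, chosen to represent a confident discriminator):

```python
import torch
import torch.nn as nn

criterion = nn.BCELoss()

# Suppose a well-trained discriminator assigns the fake sample a small
# probability of being real, e.g. D(G(z)) = 0.01.
d_out = torch.tensor([0.01], requires_grad=True)
loss = criterion(d_out, torch.ones(1))  # BCE(D(G(z)), 1) = -log(0.01)
loss.backward()
print(loss.item())        # ~4.61: a large loss
print(d_out.grad.item())  # d(loss)/d(D(G(z))) = -1/0.01 = -100: a large gradient
```

Both the loss and the gradient with respect to D(G(z)) get larger as the discriminator becomes more confident, which seems to contradict the vanishing gradient claim.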
Can somebody please explain whether I am correct, and in which case vanishing gradients actually occur when training the generator? Thank you!