When training GANs, a ‘bad’ Discriminator D is one that confidently and correctly classifies both real data and the fake data produced by the Generator G, yielding small derivatives for G to learn from. Conversely, a D that outputs values of ~0.5 regardless of the input is supposed to be ‘good’, yielding large derivatives.
In one of the models I’m training, the error term -log(1 - D(G(z))) quickly drops to ~0.69 (i.e. -log(0.5)). That should be good, because it means we are fooling D, but G doesn’t actually learn anything from it and quickly converges to, for example, outputting all zeros.
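To make the symptom concrete, here is a minimal PyTorch sketch (the toy G, D, and all sizes here are purely illustrative, not my actual model): if D collapses to outputting 0.5 regardless of its input, the loss sits at exactly -log(0.5) ≈ 0.69, yet essentially no gradient flows back into G:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy generator: maps 8-d noise to 2-d "samples".
G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))

# A discriminator collapsed to a constant: with all weights and biases
# zeroed it outputs sigmoid(0) = 0.5 for every input, and that output
# is flat with respect to the input.
D = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())
for p in D.parameters():
    nn.init.zeros_(p)

z = torch.randn(64, 8)
fake = G(z)
loss = -torch.log(1 - D(fake)).mean()  # the error term from above
loss.backward()

print(loss.item())  # ~0.6931, i.e. -log(0.5): looks like we are "fooling" D
total_grad = sum(p.grad.abs().sum() for p in G.parameters())
print(total_grad.item())  # ~0.0: no signal reaches G's parameters
```

So the loss value alone doesn’t seem to tell the whole story: 0.69 only means D outputs 0.5 on fakes, and if that output doesn’t vary with the input, the chain rule gives G nothing to follow.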
- What is the reason for this convergence?
- Why doesn’t G learn from the errors?