When an optimizer's step is called on a set of variables, some of which have not been backpropagated to, what happens? Does the optimizer skip those variables entirely? Or does it act as though they had zero gradient? This is particularly important for momentum-based optimizers and the like.
For my specific case, I'm training a GAN. The generator is only trained every Nth step, while the discriminator is trained every step. I'd like to simplify my code a bit by using one optimizer for both (particularly when saving and loading checkpoints). If I don't backpropagate to the generator on a given step, will calling the optimizer's step mess with its momentum buffers?
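A quick way to probe this, assuming PyTorch (the parameter names below are illustrative, not from my actual model): after a backward pass that doesn't touch a parameter, its `.grad` stays `None`, and I'd expect the optimizer to skip it, whereas a zeroed (but non-`None`) grad would still feed the momentum buffer.

```python
import torch

# Two parameters sharing one optimizer; gen_p stands in for a
# generator weight, disc_p for a discriminator weight.
gen_p = torch.nn.Parameter(torch.ones(1))
disc_p = torch.nn.Parameter(torch.ones(1))
opt = torch.optim.SGD([gen_p, disc_p], lr=0.1, momentum=0.9)

# Backprop only through disc_p; gen_p is not in the graph,
# so gen_p.grad remains None.
loss = (disc_p ** 2).sum()
loss.backward()
opt.step()

print(gen_p.grad)    # None -> the optimizer skipped it entirely
print(gen_p.item())  # unchanged: 1.0
print(disc_p.item()) # updated: 1.0 - 0.1 * 2.0 = 0.8
```

The caveat would be explicitly zeroed grads: `opt.zero_grad(set_to_none=False)` leaves zero tensors in place, and a parameter with a zero (not `None`) grad is still updated by any existing momentum buffer.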