Stepping an optimizer with momentum when some variables have not been backpropagated to?

When an optimizer is stepped on a set of variables, some of which have not been backpropagated to, what happens? Does the optimizer leave those variables untouched, or does it act as though they had zero gradient? This is particularly important for momentum-based optimizers and the like.

For my specific case, I’m training a GAN. The generator is only trained every Nth step, while the discriminator is trained every step. I’d like to simplify my code a bit by using a single optimizer (particularly when saving and loading). If I don’t backpropagate to the generator on a given step, will the optimizer step mess with its momentum?
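For concreteness, a rough sketch of the kind of setup I mean (the networks and losses here are just placeholders, not my actual code):

```python
import itertools
import torch

# Placeholder networks standing in for the real generator / discriminator.
generator = torch.nn.Linear(8, 8)
discriminator = torch.nn.Linear(8, 1)

# One optimizer over both sets of parameters.
optimizer = torch.optim.SGD(
    itertools.chain(generator.parameters(), discriminator.parameters()),
    lr=1e-3, momentum=0.9,
)

N = 5
for step in range(100):
    optimizer.zero_grad()
    x = torch.randn(4, 8)
    loss = discriminator(x).mean()  # discriminator trained every step (placeholder loss)
    if step % N == 0:
        # Generator only receives gradients every Nth step (placeholder loss).
        loss = loss + discriminator(generator(x)).mean()
    loss.backward()
    optimizer.step()  # does this disturb the generator's momentum on the other steps?
```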

The optimizer will ignore variables with .grad=None, but not variables whose .grad is a zero tensor of the correct shape.
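A quick way to check this behaviour yourself (a minimal sketch with SGD + momentum, not taken from this thread):

```python
import torch

a = torch.nn.Parameter(torch.ones(3))  # receives a real gradient
b = torch.nn.Parameter(torch.ones(3))  # never backpropagated to
opt = torch.optim.SGD([a, b], lr=0.1, momentum=0.9)

(a * 2.0).sum().backward()  # loss depends only on a
print(b.grad)               # None -> the step skips b entirely
opt.step()
print(b)                    # still all ones

# In contrast, a zero gradient does not stop the update: the momentum
# buffer built up on a keeps moving it even though grad == 0.
a.grad = torch.zeros(3)
before = a.detach().clone()
opt.step()
print(torch.equal(a.detach(), before))  # False
```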

Just to confirm: if they have not been backpropagated to (since the previous optimizer step), then .grad is None rather than zero?

Edit: Never mind; it looks like calling zero_grad does set .grad to zero tensors, so something else would need to be done to have the optimizer ignore those variables.
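A sketch of how to get the ignore-them behaviour back (assuming your PyTorch version supports the set_to_none argument; the model name is just a placeholder):

```python
import torch

model = torch.nn.Linear(4, 4)  # placeholder for the generator
opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# Option 1: newer PyTorch lets zero_grad leave grads as None instead of zeros.
opt.zero_grad(set_to_none=True)

# Option 2: manually null the grads of the parameters the step should skip.
for p in model.parameters():
    p.grad = None  # step() ignores parameters whose .grad is None
```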
