Stepping an optimizer with momentum when some variables have not been backpropagated to?

When an optimizer is stepped on a set of variables, some of which have not been backpropagated to, what happens? Does the optimizer leave those variables untouched, or does it act as though they had zero gradient? This is particularly important for momentum-based optimizers and the like.

For my specific case, I’m training a GAN. The generator is only trained every Nth step, while the discriminator is trained every step. I’d like to simplify my code a bit by using a single optimizer (particularly when saving and loading). If I don’t backpropagate to the generator on a given step, will the optimizer step mess with its momentum?
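For concreteness, a rough sketch of the kind of setup I mean (the networks and losses here are just placeholders, not my actual code):

```python
import itertools
import torch

# Placeholder networks standing in for the real generator / discriminator.
generator = torch.nn.Linear(8, 8)
discriminator = torch.nn.Linear(8, 1)

# One optimizer over both sets of parameters.
optimizer = torch.optim.SGD(
    itertools.chain(generator.parameters(), discriminator.parameters()),
    lr=1e-3, momentum=0.9,
)

N = 5
for step in range(100):
    optimizer.zero_grad()
    x = torch.randn(4, 8)
    loss = discriminator(x).mean()  # discriminator trained every step (placeholder loss)
    if step % N == 0:
        # Generator only receives gradients every Nth step (placeholder loss).
        loss = loss + discriminator(generator(x)).mean()
    loss.backward()
    optimizer.step()  # does this disturb the generator's momentum on the other steps?
```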

The optimizer will ignore variables with .grad=None, but not variables whose .grad is a zero tensor of the correct shape.
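A quick way to check this behaviour yourself (a minimal sketch with SGD + momentum, not taken from this thread):

```python
import torch

a = torch.nn.Parameter(torch.ones(3))  # receives a real gradient
b = torch.nn.Parameter(torch.ones(3))  # never backpropagated to
opt = torch.optim.SGD([a, b], lr=0.1, momentum=0.9)

(a * 2.0).sum().backward()  # loss depends only on a
print(b.grad)               # None -> the step skips b entirely
opt.step()
print(b)                    # still all ones

# In contrast, a zero gradient does not stop the update: the momentum
# buffer built up on a keeps moving it even though grad == 0.
a.grad = torch.zeros(3)
before = a.detach().clone()
opt.step()
print(torch.equal(a.detach(), before))  # False
```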

Just to confirm: if they have not been backpropagated to (since the previous optimizer step), then .grad is None rather than zero?

Edit: Never mind; it looks like calling zero_grad does set .grad to zero tensors, so something else would need to be done to have the optimizer ignore those variables.
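A sketch of how to get the ignore-them behaviour back (assuming your PyTorch version supports the set_to_none argument; the model name is just a placeholder):

```python
import torch

model = torch.nn.Linear(4, 4)  # placeholder for the generator
opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# Option 1: newer PyTorch lets zero_grad leave grads as None instead of zeros.
opt.zero_grad(set_to_none=True)

# Option 2: manually null the grads of the parameters the step should skip.
for p in model.parameters():
    p.grad = None  # step() ignores parameters whose .grad is None
```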
