I have a model with frozen weights. I would like to train without updating them.
From what I saw in this post: how to freeze weight correctly, I can simply set weight.requires_grad to False and create the optimizer as follows: optimizer = optim.Adam(filter(lambda p: p.requires_grad, net.parameters()), lr=0.1)
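For reference, the freezing step I have in mind looks roughly like this sketch (the `backbone` name is just an illustration, not from my actual model):

```python
import torch.optim as optim

# Freeze the chosen parameters so autograd does not compute gradients for them
for name, param in net.named_parameters():
    if name.startswith("backbone"):  # illustrative condition only
        param.requires_grad = False

# Pass only the still-trainable parameters to the optimizer
optimizer = optim.Adam(
    filter(lambda p: p.requires_grad, net.parameters()), lr=0.1
)
```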
May I ask: if I use model.train(), will this affect the training and cause the frozen weights to update? Or can I safely run it?
Setting eval/train on a module will not cause requires_grad=False weights to update. model.train()/model.eval() toggles the behavior of certain modules like batch norm and dropout, e.g. whether to update the running mean/var or whether to drop activations, so if you have any of those types of modules you will see a behavioral difference.
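A minimal sketch of that difference for batch norm (the module here is only for illustration): even with requires_grad=False on its parameters, the running statistics still update in training mode, while eval mode leaves them untouched.

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm1d(4)
for p in bn.parameters():
    p.requires_grad = False  # freezes only the affine weight/bias

x = torch.randn(8, 4)

bn.train()               # training mode: running_mean/running_var are updated
bn(x)
print(bn.running_mean)   # changed even though requires_grad is False

bn.eval()                # eval mode: running stats are used as-is, not updated
bn(x)
print(bn.running_mean)   # unchanged by this forward pass
```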
Thank you for your reply. Just to clarify: if I set optimizer = optim.Adam(filter(lambda p: p.requires_grad, net.parameters()), lr=0.1),
the batch norm layers will not update, right?
If you’ve set the .requires_grad attribute of all batchnorm parameters to False, you are correct. Otherwise, refer to @soulitzer’s post explaining that the self.training attribute, toggled via model.train()/.eval(), will not (un)freeze the trainable parameters.
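For example, something like this sketch would freeze the batchnorm affine parameters, and optionally also their running stats (assuming `net` is your model):

```python
import torch.nn as nn

for module in net.modules():
    if isinstance(module, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
        # Freeze the learnable affine parameters (weight and bias)
        for p in module.parameters():
            p.requires_grad = False
        # Optionally also keep the running statistics fixed
        module.eval()
```

Note that a later net.train() call will put these layers back into training mode, so you would need to re-apply the .eval() call afterwards if you also want the running stats to stay frozen.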