Difference between setting requires_grad=False on parameters and excluding them from the optimizer?

I want to freeze some layers (or a sub-model). There are two methods:

    • set requires_grad = False
for v in sub_model.parameters():
    v.requires_grad = False
optimizer = torch.optim.Adam(self.model.parameters())
    • do not include these parameters in the optimizer
frozen = {id(p) for p in self.sub_model.parameters()}
optimizer = Adam([p for p in self.model.parameters() if id(p) not in frozen])

What is the difference and which one should I use?
Thank you so much.

I would try to write the code as explicitly as possible and filter the unneeded parameters out, as also described e.g. here. This makes sure that the optimizer does not hold any references to these frozen parameters and will thus never update them.
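A minimal sketch of that approach (assuming model owns the sub_model to be frozen, as in the question; the names are only placeholders):

import torch

# Collect the frozen parameters by identity and keep only the rest for the optimizer.
frozen_ids = {id(p) for p in model.sub_model.parameters()}
trainable = [p for p in model.parameters() if id(p) not in frozen_ids]

# Optionally also disable autograd for the frozen part to save memory
# and compute in the backward pass.
for p in model.sub_model.parameters():
    p.requires_grad_(False)

optimizer = torch.optim.Adam(trainable)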
The first approach could also work, but be aware that the optimizer could still update frozen parameters if it has running stats for these parameters from previous iterations (e.g. as is the case for Adam): as long as a parameter's .grad is still set and the optimizer holds momentum/variance estimates for it, a step can still move it even though requires_grad is False.
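A toy example (not from the original post) illustrating the caveat: freezing a parameter after the optimizer has already taken a step does not stop Adam from updating it, because the stale gradient and running stats are still there.

import torch

w_frozen = torch.nn.Parameter(torch.ones(1))
w_train = torch.nn.Parameter(torch.ones(1))
opt = torch.optim.Adam([w_frozen, w_train], lr=0.1)

loss = (w_frozen + w_train).sum()
loss.backward()
opt.step()                      # normal step, Adam builds its running stats

w_frozen.requires_grad_(False)  # "freeze" the parameter
opt.step()                      # .grad was never cleared, so Adam moves it again
print(w_frozen)                 # the "frozen" value has changed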
