Fine-tune only the parameters in the optimizer

Hi, I am wondering, when fine-tuning only part of the parameters, what would happen if we only pass the parameters to be trained to the optimizer, but don’t set the other parameters’ requires_grad = False.

Normally we would have:

model.feature1.requires_grad_(False)  # not training
model.feature2.requires_grad_(True)   # to be trained

optimizer = Adam(model.feature2.parameters(), ...)

However, what if I do:

model.feature1.requires_grad_(True)   # not training, but gradients still enabled
model.feature2.requires_grad_(True)   # to be trained

optimizer = Adam(model.feature2.parameters(), ...)

Would this be ok?

We had a similar discussion here.
I would recommend explicitly defining how your training routine should work and would thus recommend the first approach.
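
Something like this, for example (just a minimal sketch; the Net module, layer sizes, and learning rate are placeholders):

import torch
import torch.nn as nn
from torch.optim import Adam

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.feature1 = nn.Linear(10, 10)  # to be frozen
        self.feature2 = nn.Linear(10, 2)   # to be fine-tuned

    def forward(self, x):
        return self.feature2(self.feature1(x))

model = Net()

# Explicitly freeze the parameters that should not be trained ...
for param in model.feature1.parameters():
    param.requires_grad = False

# ... and give the optimizer only the trainable parameters.
optimizer = Adam(model.feature2.parameters(), lr=1e-3)

# Equivalent, if you prefer to filter by requires_grad:
# optimizer = Adam((p for p in model.parameters() if p.requires_grad), lr=1e-3)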

I see. Yeah, I saw your reply elsewhere about the “edge” case with weight decay when all parameters are passed to the optimizer. But I’m not sure whether that edge case would still happen if certain parameters are not included in the optimizer at all. Thanks

If your current optimizer doesn’t have access to some parameters, it won’t be able to update them.
However, gradients would still be accumulated in these “unused” parameters, so if you decide after a while to create a new optimizer and pass all parameters, your training might blow up.
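
A minimal sketch of that behavior (made-up two-layer model, just to illustrate):

import torch
import torch.nn as nn
from torch.optim import Adam

model = nn.Sequential(nn.Linear(4, 4), nn.Linear(4, 1))
frozen, trained = model[0], model[1]

# Only the second layer is registered in the optimizer.
optimizer = Adam(trained.parameters(), lr=1e-3)

for _ in range(3):
    optimizer.zero_grad()                  # clears the grads of `trained` only
    loss = model(torch.randn(8, 4)).sum()
    loss.backward()                        # fills .grad of BOTH layers
    optimizer.step()                       # updates `trained` only

print(frozen.weight.grad.abs().sum())   # non-zero and growing every iteration
print(trained.weight.grad.abs().sum())  # zeroed each iteration by zero_grad()

If you later create a new optimizer over all parameters, its first step would apply these stale, accumulated gradients to the previously “unused” layer.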