The transfer learning tutorial has a relevant section:
To summarize:
- Set `requires_grad=False` for all parameters you do not wish to optimize. This avoids computing gradients for them:

  ```python
  for param in base_model.parameters():
      param.requires_grad = False
  ```
- Pass only the parameters you want to train to the optimizer, e.g. by calling `.parameters()` on that sub-network:

  ```python
  optim.Adam(model.sub_network.parameters(), ...)
  ```
If your new layers aren’t entirely contained in a single Module, you can collect parameters by using list concatenation:
```python
parameters = []
parameters.extend(model.new_layer1.parameters())
parameters.extend(model.new_layer2.parameters())
optimizer = optim.Adam(parameters, ...)
```
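Putting the pieces together, here is a minimal sketch of this workflow, assuming a hypothetical model with a frozen `backbone` and two new layers named `new_layer1` and `new_layer2`; the module names, sizes, and learning rate are illustrative assumptions, not from the tutorial.

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Illustrative model: a frozen backbone followed by two new trainable layers.
# Names and dimensions are assumptions for this sketch only.
class FineTuneModel(nn.Module):
    def __init__(self, backbone, feature_dim=512, num_classes=10):
        super().__init__()
        self.backbone = backbone
        self.new_layer1 = nn.Linear(feature_dim, 128)
        self.new_layer2 = nn.Linear(128, num_classes)

    def forward(self, x):
        features = self.backbone(x)
        return self.new_layer2(torch.relu(self.new_layer1(features)))

backbone = nn.Linear(32, 512)  # stand-in for a pretrained base model
model = FineTuneModel(backbone)

# 1) Freeze the backbone so no gradients are computed for it.
for param in model.backbone.parameters():
    param.requires_grad = False

# 2) Collect only the parameters of the new layers and pass them to the optimizer.
parameters = []
parameters.extend(model.new_layer1.parameters())
parameters.extend(model.new_layer2.parameters())
optimizer = optim.Adam(parameters, lr=1e-3)

# One illustrative training step on random data.
x = torch.randn(4, 32)
target = torch.randint(0, 10, (4,))
loss = nn.functional.cross_entropy(model(x), target)
loss.backward()
optimizer.step()
optimizer.zero_grad()
```

With this setup, only the new layers receive gradients and updates; the frozen backbone's parameters are left untouched and cost nothing in the backward pass.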