Specifying model parameters in the optimizer?

    def get_optim_policy(self):
        params = [
            {'params': self.backbone.parameters()},
            {'params': self.res_part.parameters()},
            {'params': self.global_reduction.parameters()},
            {'params': self.global_softmax.parameters()},
            {'params': self.res_part2.parameters()},
            {'params': self.reduction.parameters()},
            {'params': self.softmax.parameters()},
        ]
        return params

    optim_policy = model.get_optim_policy()
    optimizer = torch.optim.SGD(optim_policy, lr=learning_rate, momentum=0.9, weight_decay=5e-4)

I saw this kind of code in an open-source project and noticed that an optim policy is passed to torch.optim.SGD.

But I don't understand why this is done. The code below is what I'm familiar with. Can someone tell me the difference between the two? Is there a case where a more detailed optim_policy is needed?

    optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate, momentum=0.9, weight_decay=5e-4)

What you see here is just how to use the per-parameter options in torch.optim.

You can use this, for example, to specify a different learning rate per layer (because a single learning rate might not work well for all layers); see the first sketch below.
It's also possible to pass only a subset of the model's parameters to be optimized (e.g. for transfer learning, where you only want to update the parameters of the final layers); see the second sketch below.
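
A minimal sketch of the per-layer learning rate case. The toy two-layer model and the specific learning rates are placeholders, not values from the code above:

    import torch
    import torch.nn as nn

    # toy model standing in for a backbone + classifier
    model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

    optimizer = torch.optim.SGD(
        [
            # smaller lr for the earlier layer
            {'params': model[0].parameters(), 'lr': 1e-3},
            # larger lr for the final layer
            {'params': model[2].parameters(), 'lr': 1e-2},
        ],
        lr=1e-2,            # default lr for groups that don't set their own
        momentum=0.9,
        weight_decay=5e-4,
    )

Any option a group doesn't set (here momentum and weight_decay) falls back to the defaults passed to the optimizer.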
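
And a sketch of the subset case: freeze a (pretend pretrained) backbone and only hand the new head's parameters to the optimizer. The module names here are made up for illustration:

    import torch
    import torch.nn as nn

    backbone = nn.Sequential(nn.Linear(128, 64), nn.ReLU())  # pretend this is pretrained
    head = nn.Linear(64, 10)                                 # new task-specific layer

    # freeze the backbone so it receives no gradient updates
    for p in backbone.parameters():
        p.requires_grad = False

    # only the head's parameters are passed to (and updated by) the optimizer
    optimizer = torch.optim.SGD(head.parameters(), lr=1e-2, momentum=0.9)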
