Different Learning Rates within a Model

How would I apply a different learning rate to different portions of a model? Would it be as simple as creating two optimizers with different sets of model parameters and calling optimizer.step() on both for each batch?

Check the Per-parameter options section here: http://pytorch.org/docs/optim.html

Instead of feeding in a generator over all parameters, pass an iterable of dicts, each with the key params whose value is a parameter group. It should be simple to group your model's parameters according to the learning rates you want to apply to them, as in the sketch below.
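A minimal sketch of what that could look like, assuming a made-up model with two submodules named base and head (any optimizer works the same way; groups without their own lr fall back to the default passed to the constructor):

```python
import torch.nn as nn
import torch.optim as optim

# Hypothetical model split into two parts we want to train at different rates.
class TwoPartModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.base = nn.Linear(128, 64)  # e.g. a pretrained feature extractor
        self.head = nn.Linear(64, 10)   # e.g. a freshly initialized classifier

    def forward(self, x):
        return self.head(self.base(x))

model = TwoPartModel()

# One optimizer, two parameter groups, each with its own learning rate.
optimizer = optim.SGD(
    [
        {"params": model.base.parameters(), "lr": 1e-4},  # smaller lr for the base
        {"params": model.head.parameters()},              # uses the default lr below
    ],
    lr=1e-2,
    momentum=0.9,
)
```

With a single optimizer you only call optimizer.step() once per batch, and each group is updated with its own learning rate.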