I am using the OneCycleLR scheduler. It has a max_lr parameter, which the docs describe as: "Upper learning rate boundaries in the cycle for each parameter group."
I am also using an Adam optimizer, which has its own learning rate parameter.
Will the optimizer's learning rate be overwritten by the scheduler's? How do the two relate?
I would guess that the only point of specifying a learning rate in the optimizer is when you do not use any scheduler, in which case the learning rate stays constant throughout training, but I am not entirely sure.
Creating the OneCycleLR scheduler will change the learning rate of the optimizer, so there is not really a point in specifying the learning rate in the optimizer if you use a scheduler. The optimizer still has a learning rate when it is created (Adam defaults to 0.001 if you don't pass one), but as soon as you hand the optimizer to OneCycleLR, the scheduler overwrites that value: at construction it sets the learning rate to max_lr / div_factor, and every scheduler.step() after that rewrites it to follow the one-cycle shape.
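A minimal sketch (the Linear model and total_steps=100 are arbitrary placeholders, just to make it runnable) shows the lr passed to Adam being replaced as soon as the scheduler is built, and again on every scheduler.step():

```python
import torch
from torch.optim.lr_scheduler import OneCycleLR

model = torch.nn.Linear(10, 1)          # throwaway model, just to have parameters

optimizer = torch.optim.Adam(model.parameters(), lr=0.1)
print(optimizer.param_groups[0]["lr"])  # 0.1 -- the value given to Adam

# Constructing the scheduler immediately overwrites the optimizer's lr with the
# schedule's starting value, max_lr / div_factor (div_factor defaults to 25).
scheduler = OneCycleLR(optimizer, max_lr=0.01, total_steps=100)
print(optimizer.param_groups[0]["lr"])  # 0.0004 == 0.01 / 25

# Each scheduler.step() (called after optimizer.step()) rewrites the lr again,
# ramping it up toward max_lr and then annealing it back down.
for _ in range(5):
    loss = model(torch.randn(4, 10)).sum()
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    scheduler.step()
    print(optimizer.param_groups[0]["lr"])
```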
To play devil's advocate, the PyTorch docs show the optimizer being given a different learning rate (0.1) from the max_lr (0.01) in the OneCycleLR scheduler.
It seems strange that they would specify the optimizer's lr at all, much less set it to a different value, if it did nothing…
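For what it's worth, reproducing the docs example's values in a small sketch (SGD with lr=0.1 and max_lr=0.01; the model and the epochs/steps_per_epoch counts here are placeholders) suggests the 0.1 is indeed discarded the moment the scheduler is constructed:

```python
import torch
from torch.optim.lr_scheduler import OneCycleLR

model = torch.nn.Linear(10, 1)  # placeholder model

# Same pairing of values as in the docs: optimizer lr=0.1, scheduler max_lr=0.01.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
scheduler = OneCycleLR(optimizer, max_lr=0.01, steps_per_epoch=100, epochs=10)

# The 0.1 given to SGD is already gone; the schedule starts at max_lr / div_factor.
print(optimizer.param_groups[0]["lr"])  # 0.0004, not 0.1
```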