I am using the `OneCycleLR` scheduler. Its `max_lr` parameter is documented as:

> Upper learning rate boundaries in the cycle for each parameter group.
I am also using an Adam optimizer, which takes its own learning rate (`lr`).
Will the optimizer's learning rate be overwritten by the scheduler's? How do the two relate?
My guess is that specifying a learning rate in the optimizer only matters when you do not use a scheduler, in which case the learning rate stays constant throughout training, but I am not entirely sure.
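For concreteness, here is a minimal sketch of what I mean (the model and the specific values are just illustrative). As far as I can tell, `OneCycleLR` ignores the `lr` passed to the optimizer and drives `param_groups[0]["lr"]` itself, starting from `max_lr / div_factor`:

```python
import torch

model = torch.nn.Linear(4, 1)
# The lr given here (0.1) appears to be irrelevant once OneCycleLR is attached
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=0.01, total_steps=100  # div_factor defaults to 25
)

# Immediately after construction, the scheduler has already overwritten the lr:
# it is now max_lr / div_factor = 0.01 / 25 = 0.0004, not the 0.1 given to Adam
initial_lr = optimizer.param_groups[0]["lr"]
print(initial_lr)

# Step through most of the one-cycle schedule
for _ in range(99):
    optimizer.step()
    scheduler.step()

# Near the end of the cycle the lr has been annealed far below its starting value
final_lr = optimizer.param_groups[0]["lr"]
print(final_lr)
```

So it looks like the optimizer's `lr` is clobbered on every `scheduler.step()`, but I would like confirmation that this is the intended behavior.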