I am looking at some code from fb here:
which uses a stepwise lr scheduler as follows (ignoring the cosine lr scheduler):
def adjust_learning_rate(optimizer, epoch, args):
    """Decay the learning rate based on schedule"""
    lr = args.lr
    for milestone in args.schedule:
        lr *= 0.1 if epoch >= milestone else 1.
    for param_group in optimizer.param_groups:
        param_group['lr'] = lr
and the corresponding schedule argument is defined like this:
parser.add_argument('--schedule', default=[120, 160], nargs='*', type=int,
                    help='learning rate schedule (when to drop lr by 10x)')
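For context, I assume adjust_learning_rate is called once at the start of every epoch of the training loop. To see what schedule the posted function actually produces, I was planning to drive it with dummy stand-ins like this (the SimpleNamespace objects and the 0.03 base lr are just my placeholders, not the repo's setup):

from types import SimpleNamespace

# dummy stand-ins so the posted function can be exercised without torch
args = SimpleNamespace(lr=0.03, schedule=[120, 160])
optimizer = SimpleNamespace(param_groups=[{'lr': args.lr}])

for epoch in range(200):
    adjust_learning_rate(optimizer, epoch, args)  # assuming one call per epoch
    if epoch in (119, 120, 121, 159, 160, 161):
        print(epoch, optimizer.param_groups[0]['lr'])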
My understanding from the above is that at epochs 120 and 160 the learning rate should be dropped by 10x.
However, I think what these lines do:
for milestone in args.schedule:
    lr *= 0.1 if epoch >= milestone else 1.
is to keep dropping the lr at epochs 121, 122, 123, … as well, which does not make sense to me. I think it should be:
for milestone in args.schedule:
    lr *= 0.1 if epoch == milestone else 1.
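With the same dummy setup as in the snippet above, I could define this == variant and re-run the loop to compare the two behaviours (again just a sketch with placeholder values, not a claim about what the repo intends):

def adjust_learning_rate_eq(optimizer, epoch, args):
    """Like the posted function, but scales lr only when epoch equals a milestone."""
    lr = args.lr
    for milestone in args.schedule:
        lr *= 0.1 if epoch == milestone else 1.
    for param_group in optimizer.param_groups:
        param_group['lr'] = lr

for epoch in range(200):
    adjust_learning_rate_eq(optimizer, epoch, args)
    if epoch in (119, 120, 121, 159, 160, 161):
        print(epoch, optimizer.param_groups[0]['lr'])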
I wonder if I am missing something fundamental here?