Cyclic Learning Rate Max LR

Can the max_lr value for OneCycleLR be less than the lr that is used for optim.Adam?
I used the values below (lr=0.002 and max_lr=0.0005, which I set by mistake) and that seemed to give better results than a max_lr of 0.005.
A max_lr of 0.005 did not seem to give better results.

optimizer = optim.Adam(list(net.parameters()),lr=0.002)

scheduler = torch.optim.lr_scheduler.OneCycleLR(optimizer, max_lr=0.0005, steps_per_epoch=len(dataloaders['train']), epochs=total_epocs_var)

Yes, you could set max_lr to a smaller value than the learning rate passed to the optimizer, and max_lr would then become the new upper limit, as seen here:

net = nn.Linear(1, 1)
optimizer = torch.optim.Adam(list(net.parameters()), lr=0.002)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=0.0005, steps_per_epoch=2, epochs=10)

for epoch in range(20):
    optimizer.step()
    scheduler.step()
    print(scheduler.get_last_lr())

> [6.58359213500126e-05]
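As a minimal sketch (not from the original post, using the same toy setup): recording the learning rate at every step shows that the schedule starts at max_lr / div_factor (div_factor defaults to 25), peaks exactly at max_lr=0.0005, and then anneals down, regardless of the lr=0.002 passed to Adam:

```python
import torch
import torch.nn as nn

net = nn.Linear(1, 1)
optimizer = torch.optim.Adam(net.parameters(), lr=0.002)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=0.0005, steps_per_epoch=2, epochs=10)

lrs = []
for step in range(2 * 10):  # steps_per_epoch * epochs = 20 total steps
    # record the lr the optimizer would use for this step
    lrs.append(optimizer.param_groups[0]['lr'])
    optimizer.step()
    scheduler.step()

# The schedule warms up from max_lr / 25 = 2e-05 to max_lr = 0.0005,
# then anneals far below the starting point; 0.002 never appears.
print(min(lrs), max(lrs))
```

So the lr=0.002 from the Adam constructor is never actually used once the scheduler is attached.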

Thanks @ptrblck for your kind reply.
I will experiment with the code block that you have shared and see the difference if lr is also small, like 0.0005, versus a large lr of 0.002.
I guess that the lr might start with a larger value and then settle into the max value set by max_lr.

What could be the advantages of such a setting? (Or are there any drawbacks to such a setting?)

One drawback could be slightly confusing code, as the optimizer defines a different learning rate than the scheduler allows. Besides that, the scheduler will drive the learning rate, so I don't think there are other issues.
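To avoid that confusion, one option (a sketch, assuming the same toy setup as above) is to pass the intended max_lr to the optimizer as well, since OneCycleLR overwrites the optimizer's learning rate anyway, already at construction time:

```python
import torch
import torch.nn as nn

net = nn.Linear(1, 1)

# Keep the optimizer lr and the scheduler max_lr in sync to avoid confusion;
# OneCycleLR will overwrite the optimizer's lr in any case.
optimizer = torch.optim.Adam(net.parameters(), lr=0.0005)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=0.0005, steps_per_epoch=2, epochs=10)

# Creating the scheduler already replaced the optimizer's lr with the
# schedule's starting value, max_lr / div_factor (div_factor defaults to 25):
print(optimizer.param_groups[0]['lr'])  # 0.0005 / 25 = 2e-05
```

This way the code states the true peak learning rate in one place instead of two conflicting ones.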