But it still follows something like:
lr = 0.001 if epoch < 7
lr = 0.0001 if 7 <= epoch < 14
lr = 0.00001 if 14 <= epoch < 21
And at epoch = 7, if we call scheduler.step(), the lr will be updated from 0.001 to 0.0001.
At epoch = 8, if we call scheduler.step(), the lr will still be 0.0001, right?
It won’t follow this scheme if you don’t call scheduler.step() in each epoch.
Here is a small example:
import torch
from torch import optim

optimizer = optim.SGD([torch.randn(1, requires_grad=True)], lr=1e-3)
exp_lr_scheduler = optim.lr_scheduler.StepLR(
    optimizer, step_size=7, gamma=0.1)

for epoch in range(1, 25):
    exp_lr_scheduler.step()
    print('Epoch {}, lr {}'.format(
        epoch, optimizer.param_groups[0]['lr']))
> Epoch 1, lr 0.001
Epoch 2, lr 0.001
Epoch 3, lr 0.001
Epoch 4, lr 0.001
Epoch 5, lr 0.001
Epoch 6, lr 0.001
Epoch 7, lr 0.0001
Epoch 8, lr 0.0001
Epoch 9, lr 0.0001
Epoch 10, lr 0.0001
Epoch 11, lr 0.0001
Epoch 12, lr 0.0001
Epoch 13, lr 0.0001
Epoch 14, lr 1e-05
Epoch 15, lr 1e-05
Epoch 16, lr 1e-05
Epoch 17, lr 1e-05
Epoch 18, lr 1e-05
Epoch 19, lr 1e-05
Epoch 20, lr 1e-05
Epoch 21, lr 1.0000000000000002e-06
Epoch 22, lr 1.0000000000000002e-06
Epoch 23, lr 1.0000000000000002e-06
Epoch 24, lr 1.0000000000000002e-06
To get this behavior, you should call it in every epoch, not only when the current epoch equals the step size.
@ptrblck Sir, where should we use this scheduler.step() method: after model.eval() or in each epoch? My editor is throwing a warning saying you should use it after optimizer.step().
Also, can I pass step_size as a list like [15, 30, 45]? Unfortunately, that gives an error; a single number for step_size works fine.
You should call it after the optimizer.step() operation, as the warning message suggests.
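Since PyTorch 1.1, the recommended order is optimizer.step() first, then scheduler.step(). A minimal sketch of that loop, using a dummy loss just for illustration:

```python
import torch
from torch import optim

# Single dummy parameter so optimizer.step() has something to update
param = torch.randn(1, requires_grad=True)
optimizer = optim.SGD([param], lr=1e-3)
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=7, gamma=0.1)

for epoch in range(25):
    optimizer.zero_grad()
    loss = (param ** 2).sum()  # stand-in for the real training loss
    loss.backward()
    optimizer.step()    # update the parameters first ...
    scheduler.step()    # ... then update the learning rate
```

With this order the lr still decays by gamma every 7 epochs, but you avoid the warning about skipping the first lr value.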
To pass multiple milestones, you could use torch.optim.lr_scheduler.MultiStepLR.
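E.g., passing your [15, 30, 45] as milestones could look like this sketch:

```python
import torch
from torch import optim

optimizer = optim.SGD([torch.randn(1, requires_grad=True)], lr=1e-3)
# The lr is multiplied by gamma at each epoch listed in milestones
scheduler = optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[15, 30, 45], gamma=0.1)

for epoch in range(50):
    optimizer.step()    # placeholder for a real training epoch
    scheduler.step()
```

After 50 epochs the lr has been reduced three times (at epochs 15, 30, and 45), i.e. 1e-3 → 1e-4 → 1e-5 → 1e-6.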
Got it, Sir. Thanks again. One more thing: is this LR scheduler the same concept as early stopping during model training in other frameworks, or is there something else for that?
No. torch.optim.lr_scheduler is used to adjust only the learning rate hyperparameter of a model.
Early stopping refers to another hyperparameter, the number of training epochs: it stops training when the loss reaches a plateau. Models are typically over-parameterized, meaning that if you keep training for “too many” epochs, they begin to overfit to junk info (noise) in the dataset. To prevent overfitting, use early stopping: when the loss stops decreasing on the validation set, stop training.
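PyTorch does not ship an early-stopping utility, but the idea is easy to sketch yourself. The EarlyStopping class below is a hypothetical helper, not a torch API:

```python
class EarlyStopping:
    """Stop training after `patience` epochs without validation-loss improvement."""
    def __init__(self, patience=3):
        self.patience = patience
        self.best = float('inf')
        self.counter = 0

    def step(self, val_loss):
        # Returns True when training should stop
        if val_loss < self.best:
            self.best = val_loss
            self.counter = 0
        else:
            self.counter += 1
        return self.counter >= self.patience

# Usage with made-up validation losses: the loss improves until 0.9,
# then plateaus, so training stops after 3 non-improving epochs.
stopper = EarlyStopping(patience=3)
for epoch, val_loss in enumerate([1.0, 0.9, 0.95, 0.96, 0.97, 0.98]):
    if stopper.step(val_loss):
        break  # validation loss plateaued
```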