What does scheduler.step() do?

voqtuyen · June 12, 2019, 3:46pm

Why do we have to call scheduler.step() every epoch, as in the PyTorch tutorial:

# Observe that all parameters are being optimized
optimizer_ft = optim.SGD(model_ft.parameters(), lr=0.001, momentum=0.9)

# Decay LR by a factor of 0.1 every 7 epochs
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)

https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html
What if we don’t call it?

ptrblck · June 12, 2019, 3:49pm

If you don’t call it, the learning rate won’t be changed and stays at the initial value.
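
As a minimal sketch of that point (assuming a single dummy parameter in place of a real model, and a made-up quadratic loss), the loop below creates a StepLR scheduler but never calls scheduler.step(), then records the learning rate the optimizer actually uses:

```python
import torch
from torch import optim

# A single dummy parameter stands in for a real model (illustrative only).
param = torch.randn(1, requires_grad=True)
optimizer = optim.SGD([param], lr=1e-3)
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=7, gamma=0.1)

lrs_without_step = []
for epoch in range(20):
    optimizer.zero_grad()
    loss = (param ** 2).sum()  # made-up loss, just to produce a gradient
    loss.backward()
    optimizer.step()
    # scheduler.step() is deliberately never called in this loop
    lrs_without_step.append(optimizer.param_groups[0]['lr'])

# Every recorded value should still be the initial learning rate
print(set(lrs_without_step))
```

Creating the scheduler alone does not change anything over time; the decay only happens inside scheduler.step().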

voqtuyen · June 12, 2019, 3:53pm

But it still follows something like:
lr = 0.001 if epoch < 7
lr = 0.0001 if 7 <= epoch < 14
lr = 0.00001 if 14 <= epoch < 21

And for epoch = 7, if we call scheduler.step(), lr will be updated from 0.001 to 0.0001
for epoch = 8, if we call scheduler.step(), lr will be still 0.0001, right?

ptrblck · June 12, 2019, 3:59pm

It won’t follow this scheme if you don’t call scheduler.step() in each epoch.
Here is a small example:

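import torch
from torch import optim

# A single dummy parameter is enough to drive the optimizer and scheduler
optimizer = optim.SGD([torch.randn(1, requires_grad=True)], lr=1e-3)
exp_lr_scheduler = optim.lr_scheduler.StepLR(optimizer,
                                             step_size=7, gamma=0.1)

for epoch in range(1, 25):
    # Advance the schedule once per epoch, then read the current lr
    exp_lr_scheduler.step()
    print('Epoch {}, lr {}'.format(
        epoch, optimizer.param_groups[0]['lr']))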

> Epoch 1, lr 0.001
Epoch 2, lr 0.001
Epoch 3, lr 0.001
Epoch 4, lr 0.001
Epoch 5, lr 0.001
Epoch 6, lr 0.001
Epoch 7, lr 0.0001
Epoch 8, lr 0.0001
Epoch 9, lr 0.0001
Epoch 10, lr 0.0001
Epoch 11, lr 0.0001
Epoch 12, lr 0.0001
Epoch 13, lr 0.0001
Epoch 14, lr 1e-05
Epoch 15, lr 1e-05
Epoch 16, lr 1e-05
Epoch 17, lr 1e-05
Epoch 18, lr 1e-05
Epoch 19, lr 1e-05
Epoch 20, lr 1e-05
Epoch 21, lr 1.0000000000000002e-06
Epoch 22, lr 1.0000000000000002e-06
Epoch 23, lr 1.0000000000000002e-06
Epoch 24, lr 1.0000000000000002e-06

To get this behavior, you should call it in every epoch, not only if the current epoch equals the step size.
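
For what it's worth, here is a rough sketch (not taken from the tutorial) that compares the scheduler against the piecewise scheme from the question, written as base_lr * gamma ** (epoch // step_size). It assumes 0-indexed epochs, a dummy parameter, a made-up loss, and scheduler.step() called once at the end of every epoch after optimizer.step():

```python
import math
import torch
from torch import optim

base_lr, step_size, gamma = 1e-3, 7, 0.1

# Dummy parameter in place of a real model (illustrative only)
param = torch.randn(1, requires_grad=True)
optimizer = optim.SGD([param], lr=base_lr)
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=step_size, gamma=gamma)

observed = []
for epoch in range(24):
    # Learning rate that this epoch's parameter updates will use
    observed.append(optimizer.param_groups[0]['lr'])
    optimizer.zero_grad()
    (param ** 2).sum().backward()  # made-up loss
    optimizer.step()
    scheduler.step()  # once per epoch, after optimizer.step()

# Closed-form version of the scheme from the question
expected = [base_lr * gamma ** (epoch // step_size) for epoch in range(24)]

# Compare with a tolerance, since repeated multiplication accumulates rounding
matches = all(math.isclose(o, e, rel_tol=1e-9) for o, e in zip(observed, expected))
print(matches)
```

The tolerance is there because values such as 1.0000000000000002e-06 in the output above differ from the exact power by floating-point rounding.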

krishna511 · December 11, 2020, 5:46pm

@ptrblck Sir, where should we call scheduler.step(): after model.eval() or in each epoch? My editor is throwing a warning saying it should be called after optimizer.step().
Also, can I pass step_size as a list such as [15, 30, 45]? Unfortunately that gives an error, while a single number for step_size works fine.

ptrblck · December 11, 2020, 11:08pm

You should call it after the optimizer.step() operation as given in the warning message.
To pass multiple milestones, you could use torch.optim.lr_scheduler.MultiStepLR.
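
A small sketch of both points, assuming a dummy parameter, a made-up loss, and an arbitrary starting learning rate of 0.1: MultiStepLR takes the list [15, 30, 45] through its milestones argument, and scheduler.step() is called once per epoch after optimizer.step().

```python
import torch
from torch import optim

# Dummy parameter in place of a real model (illustrative only)
param = torch.randn(1, requires_grad=True)
optimizer = optim.SGD([param], lr=0.1)

# milestones accepts a list of epoch indices, unlike StepLR's integer step_size
scheduler = optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[15, 30, 45], gamma=0.1)

lr_by_epoch = []
for epoch in range(50):
    # Learning rate in effect for this epoch
    lr_by_epoch.append(optimizer.param_groups[0]['lr'])
    optimizer.zero_grad()
    (param ** 2).sum().backward()  # made-up loss
    optimizer.step()   # update the parameters first
    scheduler.step()   # then advance the schedule

# Print the learning rate just before and at each milestone
for epoch in (0, 14, 15, 29, 30, 44, 45):
    print(epoch, lr_by_epoch[epoch])
```

In a real training loop, optimizer.step() would run once per batch and scheduler.step() once per epoch, after the batch loop has finished.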

krishna511 · December 12, 2020, 2:04pm

Got it, Sir. Thanks again. One more thing: is the LR scheduler here the same concept as early stopping in model training in other frameworks, or is there something else for that?

slmatrix · December 12, 2020, 4:16pm

No. torch.optim.lr_scheduler adjusts only one hyperparameter, the learning rate used by the optimizer.

Early stopping refers to another hyperparameter, the number of training epochs. It means stopping training when the loss reaches a plateau. Typically, models are over-parameterized, meaning that if you keep training for “too many” epochs, they will begin to overfit on junk info (noise) from the dataset. To prevent overfitting, use early stopping: when the loss stops decreasing on the validation set, stop training.
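
To make that concrete, here is a minimal sketch of patience-based early stopping. The EarlyStopping class is a hypothetical helper written for this example (it is not part of torch), and the validation losses are made-up numbers:

```python
class EarlyStopping:
    """Hypothetical helper for illustration; not a torch API."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience      # epochs to wait without improvement
        self.min_delta = min_delta    # smallest decrease that counts as improvement
        self.best_loss = float('inf')
        self.bad_epochs = 0

    def should_stop(self, val_loss):
        if val_loss < self.best_loss - self.min_delta:
            # Validation loss improved: remember it and reset the counter
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            # No improvement this epoch
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up validation losses that improve and then plateau
val_losses = [1.0, 0.8, 0.7, 0.65, 0.66, 0.67, 0.68, 0.69]

stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, val_loss in enumerate(val_losses):
    # In real training, the train and validation passes would run here
    if stopper.should_stop(val_loss):
        stopped_at = epoch
        break

print(stopped_at, stopper.best_loss)
```

The patience value is a judgment call: too small and training may stop on a noisy epoch, too large and the model may already have started to overfit.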

Jason has a great post on early stopping here.