What does scheduler.step() do?

(Voqtuyen) #1

Why do we have to call scheduler.step() every epoch, as in the PyTorch transfer learning tutorial:

# Observe that all parameters are being optimized
optimizer_ft = optim.SGD(model_ft.parameters(), lr=0.001, momentum=0.9)

# Decay LR by a factor of 0.1 every 7 epochs
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)

https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html
What if we don’t call it?

#2

If you don’t call it, the learning rate won’t be changed and will stay at its initial value.
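
For example, here is a minimal sketch (using a dummy parameter in place of the tutorial's model_ft) where scheduler.step() is never called, so the printed learning rate stays at 0.001 for every epoch:

import torch
import torch.optim as optim
from torch.optim import lr_scheduler

# Dummy parameter instead of a real model, just to inspect the optimizer state
optimizer = optim.SGD([torch.randn(1, requires_grad=True)], lr=0.001, momentum=0.9)
scheduler = lr_scheduler.StepLR(optimizer, step_size=7, gamma=0.1)

for epoch in range(25):
    # scheduler.step() is intentionally never called here,
    # so the learning rate never decays
    print('Epoch {}, lr {}'.format(epoch, optimizer.param_groups[0]['lr']))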

(Voqtuyen) #3

But it still follows something like:
lr = 0.001 if epoch < 7
lr = 0.0001 if 7 <= epoch < 14
lr = 0.00001 if 14 <= epoch < 21

And at epoch = 7, if we call scheduler.step(), lr will be updated from 0.001 to 0.0001,
and at epoch = 8, if we call scheduler.step(), lr will still be 0.0001, right?

#4

It won’t follow this scheme if you don’t call scheduler.step() in each epoch.
Here is a small example:

import torch
import torch.optim as optim

optimizer = optim.SGD([torch.randn(1, requires_grad=True)], lr=1e-3)
exp_lr_scheduler = optim.lr_scheduler.StepLR(optimizer,
                                             step_size=7, gamma=0.1)

for epoch in range(1, 25):
    # StepLR decays the lr by gamma once every step_size calls to step()
    exp_lr_scheduler.step()
    print('Epoch {}, lr {}'.format(
        epoch, optimizer.param_groups[0]['lr']))

Epoch 1, lr 0.001
Epoch 2, lr 0.001
Epoch 3, lr 0.001
Epoch 4, lr 0.001
Epoch 5, lr 0.001
Epoch 6, lr 0.001
Epoch 7, lr 0.0001
Epoch 8, lr 0.0001
Epoch 9, lr 0.0001
Epoch 10, lr 0.0001
Epoch 11, lr 0.0001
Epoch 12, lr 0.0001
Epoch 13, lr 0.0001
Epoch 14, lr 1e-05
Epoch 15, lr 1e-05
Epoch 16, lr 1e-05
Epoch 17, lr 1e-05
Epoch 18, lr 1e-05
Epoch 19, lr 1e-05
Epoch 20, lr 1e-05
Epoch 21, lr 1.0000000000000002e-06
Epoch 22, lr 1.0000000000000002e-06
Epoch 23, lr 1.0000000000000002e-06
Epoch 24, lr 1.0000000000000002e-06

To get this behavior, you should call it in every epoch, not only when the current epoch equals the step size.
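
On a related note, since PyTorch 1.1.0 the recommended ordering is to call optimizer.step() before scheduler.step(). A minimal sketch of such a per-epoch loop, using a dummy nn.Linear model and random data in place of the tutorial's model_ft and dataloaders, could look like this:

import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler

# Tiny stand-ins for the tutorial's model_ft and dataloaders
model = nn.Linear(10, 2)
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
scheduler = lr_scheduler.StepLR(optimizer, step_size=7, gamma=0.1)

inputs = torch.randn(4, 10)
targets = torch.randint(0, 2, (4,))

for epoch in range(25):
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    optimizer.step()   # update the parameters first ...
    scheduler.step()   # ... then decay the lr once per epoch (PyTorch >= 1.1 ordering)
    print('Epoch {}, lr {}'.format(epoch, optimizer.param_groups[0]['lr']))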
