Hi,
I’m trying to use a couple of torch.optim.lr_schedulers together, but I don’t seem to be getting the results I’m expecting.
I read #13022 and #26423, and my understanding is that one should simply create multiple lr_schedulers and call step on all of them at the end of each epoch.
However, running:
from torch.optim import SGD, lr_scheduler
model = ... # Doesn't really matter, use anything you like
optim = SGD(model.parameters(), 0.1)
scheduler1 = lr_scheduler.LambdaLR(optim, lambda epoch: min(epoch / 3, 1))
scheduler2 = lr_scheduler.MultiStepLR(optim, [5, 8])
for epoch in range(10):
    print(epoch, optim.param_groups[0]['lr'])
    scheduler1.step()
    scheduler2.step()
The output I expected is:
0 0.0333...
1 0.0666...
2 0.1
3 0.1
4 0.1
5 0.01
6 0.01
7 0.01
8 0.001
9 0.001
i.e. a linear warm-up over the first 3 epochs, a constant lr until epoch 5, a decrease by a factor of 0.1, a constant lr again until epoch 8, and a final decrease by a factor of 0.1.
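For reference, here is how I'm computing the schedule I expect by hand (the epoch + 1 offset just reproduces the numbers above; I assume it comes from the schedulers stepping once on construction, but I'm not sure about that):

```python
# Hand-computed version of the schedule I expect: a linear warm-up
# factor multiplied with a step decay at the milestones 5 and 8.
base_lr = 0.1
milestones = [5, 8]
for epoch in range(10):
    warmup = min((epoch + 1) / 3, 1)                    # linear warm-up over 3 epochs
    decay = 0.1 ** sum(epoch >= m for m in milestones)  # MultiStepLR-style decay
    print(epoch, base_lr * warmup * decay)
```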
But the actual output is:
0 0.03333333333333333
1 0.06666666666666667
2 0.1
3 0.1
4 0.1
5 0.010000000000000002
6 0.1
7 0.1
8 0.010000000000000002
9 0.1
The initial linear warm-up happens as expected; however, after each milestone the lr reverts back to 0.1 on the following epoch.
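If it helps, the actual output is exactly what I'd get if LambdaLR recomputed the lr from the base lr on every step, wiping out MultiStepLR's decay everywhere except in the milestone epoch itself. This is just my guess at a model that reproduces the numbers, not a claim about how the schedulers are actually implemented:

```python
# Model that reproduces the actual output I'm seeing: the lr is
# overwritten from base_lr each epoch, and the milestone decay
# only survives for the one epoch in which it is applied.
base_lr = 0.1
milestones = [5, 8]
for epoch in range(10):
    lr = base_lr * min((epoch + 1) / 3, 1)  # recomputed from base_lr every time
    if epoch in milestones:
        lr *= 0.1                           # decay applies only in this epoch
    print(epoch, lr)
```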
Am I doing something wrong? Any help would be appreciated.