I am looking for some guidance on the correct way to use/order `scheduler.step()` within epochs. Of course, the official guidance (https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate) says:
```python
# PATTERN 1
scheduler = ...
for epoch in range(100):
    train(...)
    validate(...)
    scheduler.step()
```
But then I also see code like this (see for example: https://www.deeplearningwizard.com/deep_learning/boosting_models_pytorch/lr_scheduling/):
```python
# PATTERN 2
for epoch in range(num_epochs):
    # Decay Learning Rate
    scheduler.step()
    # Print Learning Rate
    print('Epoch:', epoch, 'LR:', scheduler.get_lr())
    for i, (images, labels) in enumerate(train_loader):
        # Load images
        images = images.view(-1, 28*28).requires_grad_()
```
I also see this same pattern in quite a few GitHub repos.
So, I did some small-scale experiments using these two patterns in PyTorch 1.6, and I got slightly better results using PATTERN:1.
So, my questions are:

- Does the order matter?
- Does the order depend on the step size, or the epoch number, or ...?
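One way to see why the order matters, without any training at all: the sketch below simulates a StepLR-style decay (halving the LR every epoch) and records the LR each epoch actually trains with under both patterns. The tiny `run` function is a stand-in for the real scheduler, not PyTorch code.

```python
def run(pattern, epochs=3, base_lr=0.1, gamma=0.5):
    """Record the LR each epoch trains with.

    The `lr *= gamma` lines mimic scheduler.step() for a
    StepLR(step_size=1, gamma=0.5)-style schedule.
    """
    lr = base_lr
    seen = []
    for epoch in range(epochs):
        if pattern == 2:      # PATTERN 2: scheduler.step() at the top of the loop
            lr *= gamma
        seen.append(lr)       # the LR the optimizer would use this epoch
        # ... train(...) / validate(...) would run here ...
        if pattern == 1:      # PATTERN 1: scheduler.step() after training
            lr *= gamma
    return seen

print(run(pattern=1))  # [0.1, 0.05, 0.025]
print(run(pattern=2))  # [0.05, 0.025, 0.0125]
```

With PATTERN 2 the model never trains at the base LR of 0.1, which is exactly the "skipping the first value of the learning rate schedule" that the warning below describes.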
Also, I have another problem: I have two envs with PyTorch 1.6, and I use PATTERN:1 in both. In one of the envs I get the warning:
```
UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`. Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate

/home/ok/ok0/ok1/lib/python3.7/site-packages/torch/optim/lr_scheduler.py:351: UserWarning: To get the last learning rate computed by the scheduler, please use `get_last_lr()`.
```
but in the other env, I don't. I can't think why this could be!
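For reference, here is a minimal loop that follows the post-1.1.0 ordering and uses `get_last_lr()`, so neither of those warnings should fire. The model, data, and loss here are placeholders just to make `optimizer.step()` meaningful; this is a sketch, not a full training setup.

```python
import torch

model = torch.nn.Linear(2, 1)  # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.5)

for epoch in range(3):
    optimizer.zero_grad()
    loss = model(torch.randn(4, 2)).pow(2).mean()  # stand-in for train(...)
    loss.backward()
    optimizer.step()    # optimizer first ...
    scheduler.step()    # ... then the scheduler: no order warning
    # get_last_lr() replaces the deprecated get_lr() for logging
    print('Epoch:', epoch, 'LR:', scheduler.get_last_lr())
```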