How to do exponential learning rate decay in PyTorch?

Ah it’s interesting how you make the learning rate scheduler first in TensorFlow, then pass it into your optimizer.

In PyTorch, we first make the optimizer:

import torch
import torchvision

my_model = torchvision.models.resnet50()

my_optim = torch.optim.Adam(params=my_model.parameters(), lr=0.001, betas=(0.9, 0.999), eps=1e-08, weight_decay=0, amsgrad=False)

Note that optimizers in PyTorch typically take the parameters of your model as input, which is why an example model is defined above. The arguments passed to Adam here are just the defaults; you can definitely change lr to whatever your starting learning rate should be.

After making the optimizer, you want to wrap it inside an lr_scheduler:

decayRate = 0.96
my_lr_scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer=my_optim, gamma=decayRate)
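If you want to sanity-check what gamma does before training (this is just an illustrative snippet, not part of the training loop): each call to step() multiplies the current learning rate by gamma, so after n epochs the lr is 0.001 * 0.96 ** n. get_last_lr() returns a list with one value per parameter group.

print(my_lr_scheduler.get_last_lr())   # [0.001] before any step()
my_lr_scheduler.step()                 # multiplies the lr by gamma
print(my_lr_scheduler.get_last_lr())   # roughly [0.00096]

(In real training you would only call my_lr_scheduler.step() after the optimizer has stepped, as in the loop below.)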

Then train as usual in PyTorch:

for epoch in range(num_epochs):
    train_epoch()
    valid_epoch()

    my_lr_scheduler.step()  # decay the learning rate once per epoch

Note that the my_lr_scheduler.step() call is what decays your learning rate every epoch. train_epoch() and valid_epoch() are placeholders for one pass over your training data and your validation/test data. Be sure to still step with your optimizer for every batch in your training data! In other words, you still need the my_optim.zero_grad(), loss.backward(), and my_optim.step() calls inside train_epoch(). Don't confuse the two step() calls: the optimizer's and the scheduler's are separate, and you need them both.
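Here's a minimal sketch of what train_epoch() might look like. train_loader and loss_fn are hypothetical names (a DataLoader and a criterion such as nn.CrossEntropyLoss) that aren't in the original post; the point is just that the optimizer steps once per batch while the scheduler steps once per epoch in the outer loop above.

def train_epoch():
    my_model.train()
    for inputs, targets in train_loader:   # train_loader: a hypothetical DataLoader over your training set
        my_optim.zero_grad()               # clear gradients from the previous batch
        outputs = my_model(inputs)
        loss = loss_fn(outputs, targets)   # loss_fn: a hypothetical criterion, e.g. nn.CrossEntropyLoss()
        loss.backward()                    # backpropagate
        my_optim.step()                    # optimizer steps every batch; the scheduler steps every epoch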

Here’s a good example:
TorchVision Object Detection Finetuning Tutorial
