Print current learning rate of the Adam Optimizer?

Jonathan_R_Williford · November 8, 2018, 2:08am

Adam effectively uses a separate step size for each parameter. The param_group['lr'] value is only the base learning rate; Adam itself never modifies it, so it changes only if you or a learning rate scheduler set it. The PyTorch Adam implementation has no variable that stores the dynamic per-parameter rates.
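As a minimal sketch of what can be printed directly: the snippet below reads the base learning rate from each parameter group. The toy model, the random input, and the helper name base_learning_rates are assumptions made up for illustration, not something from the PyTorch API.

```python
import torch

# Hypothetical toy model and optimizer, only for illustration.
model = torch.nn.Linear(4, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)


def base_learning_rates(opt):
    """Return the base learning rate of every parameter group."""
    return [group["lr"] for group in opt.param_groups]


# Take one optimization step on random data.
loss = model(torch.randn(8, 4)).pow(2).mean()
loss.backward()
optimizer.step()

# The base rate is the same before and after step(); Adam does not touch it.
print(base_learning_rates(optimizer))
```

If a scheduler is attached, the same loop would show whatever value the scheduler last wrote into each group.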

One could save the optimizer state, as mentioned here:

The PyTorch implementation of Adam can be found here:
https://pytorch.org/docs/stable/_modules/torch/optim/adam.html

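In that source, the loop for p in group['params']: iterates over all the parameters. The update computed for each one scales the base learning rate by that parameter's running gradient statistics (exp_avg and exp_avg_sq in the optimizer state), which is what gives every parameter its own effective step size.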
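To make the idea of a per-parameter rate concrete, here is a rough sketch of the textbook Adam update for a single scalar parameter, written in plain Python. It is not the PyTorch code: the function names adam_step and run are hypothetical, and the real implementation may differ in small details such as where eps is added. The quantity lr / (sqrt(v_hat) + eps) is what one might call the effective learning rate.

```python
import math


def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One textbook Adam update for a single scalar parameter.

    Returns the new (param, m, v) plus the effective rate
    lr / (sqrt(v_hat) + eps) that multiplies the bias-corrected momentum.
    """
    m = beta1 * m + (1 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction, t starts at 1
    v_hat = v / (1 - beta2 ** t)
    effective_lr = lr / (math.sqrt(v_hat) + eps)
    param = param - effective_lr * m_hat
    return param, m, v, effective_lr


def run(grads, lr=1e-3):
    """Apply Adam to a made-up gradient sequence and collect the effective rates."""
    param, m, v = 0.0, 0.0, 0.0
    rates = []
    for t, g in enumerate(grads, start=1):
        param, m, v, eff = adam_step(param, g, m, v, t, lr=lr)
        rates.append(eff)
    return rates


# The base lr stays at 1e-3 throughout, yet the effective rate drops
# once the gradients become large.
rates = run([0.01, 0.01, 10.0, 10.0])
print(rates)
```

In PyTorch the corresponding running averages live in optimizer.state[p], so a similar calculation could in principle be done per tensor from the saved state.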

I found this helpful: http://ruder.io/optimizing-gradient-descent/index.html#adam