Get the best learning rate automatically

Hi @shirui-japina,

There is actually a researcher called Leslie N. Smith who wrote this paper.

Based on this paper, some other guy created the learning rate finder.

What it does is measure the loss at different learning rates and plot a diagram like this one:

[image: loss plotted against learning rate]

Empirically, the best learning rate turns out to be a value approximately in the middle of the sharpest downward slope.
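To make the idea concrete, here is a minimal sketch of such a learning rate range test in plain Python, without any framework. Everything in it is an assumption for illustration: the names `lr_range_test`, `steepest_lr` and `make_toy_step` are hypothetical, and the smoothing factor, the stop rule and the toy regression problem are my own choices, not taken from the paper or from any library.

```python
import math


def lr_range_test(step_fn, lr_min=1e-4, lr_max=10.0, num_steps=100, beta=0.9):
    """Take `num_steps` training steps while the learning rate grows
    exponentially from `lr_min` to `lr_max`.

    `step_fn(lr)` is a hypothetical callback: it performs one update at the
    given learning rate and returns the loss measured before that update.
    Returns two parallel lists: learning rates and smoothed losses.
    """
    growth = (lr_max / lr_min) ** (1.0 / (num_steps - 1))
    lrs, losses = [], []
    avg, best, lr = 0.0, float("inf"), lr_min
    for i in range(num_steps):
        loss = step_fn(lr)
        if not math.isfinite(loss):
            break
        # Bias-corrected exponential moving average of the loss.
        avg = beta * avg + (1 - beta) * loss
        smooth = avg / (1 - beta ** (i + 1))
        # Assumed stop rule: give up once the loss is 4x its best value.
        if smooth > 4 * best:
            break
        best = min(best, smooth)
        lrs.append(lr)
        losses.append(smooth)
        lr *= growth
    return lrs, losses


def steepest_lr(lrs, losses):
    """Return the learning rate where the smoothed loss falls fastest
    per unit of log(lr), using central differences."""
    slopes = [
        (losses[i + 1] - losses[i - 1])
        / (math.log(lrs[i + 1]) - math.log(lrs[i - 1]))
        for i in range(1, len(lrs) - 1)
    ]
    i_min = min(range(len(slopes)), key=slopes.__getitem__)
    return lrs[i_min + 1]


def make_toy_step():
    """Toy problem (an assumption for illustration): fit y = 3x with a
    single weight using full-batch gradient descent on squared error."""
    xs = [-1 + 2 * i / 31 for i in range(32)]
    ys = [3.0 * x for x in xs]
    state = {"w": 0.0}

    def step(lr):
        w = state["w"]
        errs = [w * x - y for x, y in zip(xs, ys)]
        loss = sum(e * e for e in errs) / len(xs)
        grad = sum(2 * e * x for e, x in zip(errs, xs)) / len(xs)
        state["w"] = w - lr * grad
        return loss

    return step


lrs, losses = lr_range_test(make_toy_step())
print(f"suggested learning rate: {steepest_lr(lrs, losses):.3g}")
```

With a real model you would replace `make_toy_step` with a callback that runs one mini-batch through your network and optimizer, and you would normally plot `losses` against `lrs` on a log axis instead of trusting a single number.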

However, modern practice is to vary the learning rate during training, as described here.

At the end you would probably do learning rate annealing.


[The first image shows the learning rate over time; the second shows the momentum.]
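As a rough illustration of a schedule with that shape, here is a sketch in plain Python: the learning rate ramps up and is then annealed, while the momentum moves in the opposite direction. The function name `one_cycle`, the cosine interpolation and all default values (`warmup_frac`, `start_div`, `final_div`, `mom_high`, `mom_low`) are assumptions chosen for the example, not the exact settings of any library.

```python
import math


def _cos_interp(start, end, frac):
    """Cosine interpolation from `start` (frac=0) to `end` (frac=1)."""
    return end + (start - end) * (1 + math.cos(math.pi * frac)) / 2


def one_cycle(step, total_steps, lr_max, warmup_frac=0.3,
              start_div=25.0, final_div=1e4, mom_high=0.95, mom_low=0.85):
    """Return (learning_rate, momentum) for `step` in [0, total_steps)."""
    up_steps = max(1, round(total_steps * warmup_frac))
    if step < up_steps:
        # Phase 1: learning rate rises to lr_max, momentum falls.
        frac = step / up_steps
        lr = _cos_interp(lr_max / start_div, lr_max, frac)
        mom = _cos_interp(mom_high, mom_low, frac)
    else:
        # Phase 2: learning rate is annealed far below its starting
        # value, momentum climbs back up.
        down_steps = max(1, total_steps - 1 - up_steps)
        frac = min(1.0, (step - up_steps) / down_steps)
        lr = _cos_interp(lr_max, lr_max / final_div, frac)
        mom = _cos_interp(mom_low, mom_high, frac)
    return lr, mom


# Print a few points of a 100-step schedule with a peak rate of 0.1.
for s in (0, 15, 30, 65, 99):
    lr, mom = one_cycle(s, 100, 0.1)
    print(f"step {s:3d}: lr={lr:.5f} momentum={mom:.3f}")
```

In practice you would set the optimizer's learning rate and momentum from these values before every step. PyTorch also ships `torch.optim.lr_scheduler.OneCycleLR`, which implements a schedule of this kind, so you rarely need to write it by hand.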

It really depends on the optimization algorithm, but the techniques from fastai work well with Adam and SGD.

Lastly, some newer optimization algorithms don’t even care about the learning rate ;).
