How do you choose a good initial learning rate for Adam?
When it's too big, why does the network converge to a wrong solution? Normally a learning rate that is too big should make the loss oscillate a lot, not get the model stuck in a wrong solution, right?
You could either run a few experiments with different learning rates, or use a utility that searches for an "optimal" learning rate, e.g. the learn.lr_find() operation from fastai.
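The first suggestion (running a few experiments) can be sketched in plain PyTorch. This is a toy example under assumed conditions: the model, data, candidate learning rates, and step count are all placeholders you would swap for your own setup.

```python
# Sketch: pick an initial learning rate for Adam by running a few short
# trials and comparing the loss after a fixed number of steps.
# Model, data, and candidate rates are toy placeholders.
import torch
import torch.nn as nn

def trial_loss(lr, steps=100, seed=0):
    torch.manual_seed(seed)                    # same init/data for every trial
    x = torch.randn(64, 10)
    y = x @ torch.randn(10, 1)                 # linear target the model can fit
    model = nn.Linear(10, 1)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(x), y)
        loss.backward()
        opt.step()
    return loss.item()

candidates = [1e-1, 1e-2, 1e-3, 1e-4]          # arbitrary grid, adjust to taste
losses = {lr: trial_loss(lr) for lr in candidates}
best_lr = min(losses, key=losses.get)
print(best_lr, losses)
```

In a real setting you would run each trial on your actual model and a subset of your training data, and you might also watch for the oscillation/divergence behavior mentioned above, not just the final loss.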
Thanks so much. Is learn.lr_find() restricted to any particular network? I still don't know how to use it, since I don't use the model from the link you posted.
No, I don't think this operation depends on the model architecture. You would need to install and use the fastai
package, or reimplement their learning rate finder in pure PyTorch.
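A reimplementation along those lines can be sketched as follows. This is a simplified version of the learning-rate range test idea behind learn.lr_find(), not fastai's actual code: the learning rate grows exponentially each batch while the loss is recorded, and the test stops once the loss clearly diverges. The model, data, and divergence threshold here are assumptions for illustration.

```python
# Sketch of a learning-rate range test in plain PyTorch:
# raise the lr exponentially each step, log (lr, loss), and stop
# once the loss blows up past a multiple of the best loss seen.
import math
import torch
import torch.nn as nn

def lr_range_test(model, opt, loss_fn, batches, start_lr=1e-7, end_lr=10.0):
    num = len(batches)
    gamma = (end_lr / start_lr) ** (1.0 / max(num - 1, 1))  # per-step multiplier
    lr, history, best = start_lr, [], float("inf")
    for x, y in batches:
        for group in opt.param_groups:
            group["lr"] = lr                  # set the lr for this step
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
        history.append((lr, loss.item()))
        best = min(best, loss.item())
        if math.isnan(loss.item()) or loss.item() > 4 * best:
            break                             # loss diverged; end the test
        lr *= gamma
    return history

# Toy usage with placeholder model and random data:
torch.manual_seed(0)
model = nn.Linear(10, 1)
opt = torch.optim.Adam(model.parameters())
data = [(torch.randn(32, 10), torch.randn(32, 1)) for _ in range(100)]
hist = lr_range_test(model, opt, nn.functional.mse_loss, data)
```

You would then plot loss against learning rate (log scale) and pick a value somewhat below the point where the loss is falling fastest, which is roughly what fastai's plot is used for.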