I am aware that some previous posts discuss similar issues, but I didn’t find a solution.
In my case, with a learning rate of 0.001 (I know that is not high in other settings, but it seems to be for mine), my NN starts outputting NaNs after a few iterations. If I lower the learning rate to something like 0.00001, the network outputs reasonable numbers, but the parameters barely change from one epoch to the next, so training effectively stalls. What is a good way to fix this?
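To illustrate the failure mode I mean, here is a toy sketch (plain gradient descent on a steep quadratic, not my actual model or data): when the step size is too large relative to the curvature, the iterates blow up to non-finite values, and when it is very small, they barely move.

```python
import math

def gd(lr, steps=200, curvature=1000.0):
    """Plain gradient descent on loss(w) = curvature * w**2, starting from w = 1.

    With a steep quadratic, a large lr makes each step overshoot and the
    iterates grow without bound (eventually non-finite), while a tiny lr
    shrinks w by only a factor of (1 - 2 * curvature * lr) per step, which
    is nearly 1 -- i.e. almost no progress.
    """
    w = 1.0
    for _ in range(steps):
        grad = 2.0 * curvature * w   # d/dw of curvature * w**2
        w = w - lr * grad
        if not math.isfinite(w):
            return w                 # diverged to inf/nan, like my NaN outputs
    return w

print(gd(0.1))    # too large: diverges to a non-finite value
print(gd(1e-4))   # well-matched: converges toward 0
print(gd(1e-7))   # too small: w barely moves from its start at 1.0
```

My real network obviously isn't a quadratic, but the symptoms (NaNs at 0.001, near-zero updates at 0.00001) look like the same trade-off.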