Adam+Half Precision = NaNs?

I had previously tried raising epsilon after flagging it as the culprit, but I can't recall exactly which values I tried. Right now I'm training with eps=1e-4 and it's working just fine. Guess I should have dug into that further, thanks!
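
For anyone landing here later, a quick sketch of why the epsilon value matters in half precision. It assumes the starting point was Adam's usual default of eps=1e-8 (the thread doesn't say so explicitly), and it uses the standard-library `struct` module to mimic float16 rounding rather than any real optimizer code. `to_fp16` is just a helper name made up for this example.

```python
import struct

def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

# Assumption: 1e-8 is the default epsilon that was in use before the change.
eps_default = to_fp16(1e-8)
# The value reported to work in this thread.
eps_raised = to_fp16(1e-4)

print(eps_default)  # underflows: smaller than the smallest fp16 subnormal (~6e-8)
print(eps_raised)   # survives rounding, stays close to 1e-4

# Adam divides by sqrt(v) + eps. If the second-moment estimate v is zero
# (or underflows to zero) and eps has also underflowed, the denominator
# is exactly zero, which is where the NaNs/Infs can come from.
v = 0.0
denom_default = to_fp16(v ** 0.5 + eps_default)
denom_raised = to_fp16(v ** 0.5 + eps_raised)
print(denom_default, denom_raised)
```

So with eps=1e-4 the denominator stays strictly positive even when the squared-gradient estimate vanishes, which would be consistent with the training run above no longer blowing up.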
