Loss: inf & Parameters: nan - Why?

I am training a simple polynomial model, `w2 * t_u ** 2 + w1 * t_u + b`.

Implementation Details
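
Here is a minimal sketch of the kind of setup I have (the `t_u`/`t_c` values and the learning rate are placeholders, not my exact data; any large, unnormalized input reproduces the behavior):

```python
import torch

# Hypothetical data: t_u is an unnormalized input spanning roughly 20-80,
# t_c is the target (the names follow the post; the values are made up).
t_u = torch.tensor([35.7, 55.9, 58.2, 81.9, 56.3, 48.9, 33.9, 21.8, 48.4, 60.4])
t_c = torch.tensor([0.5, 14.0, 15.0, 28.0, 11.0, 8.0, 3.0, -4.0, 6.0, 13.0])

def model(t_u, w2, w1, b):
    return w2 * t_u ** 2 + w1 * t_u + b

def loss_fn(t_p, t_c):
    return ((t_p - t_c) ** 2).mean()  # mean squared error

params = torch.zeros(3, requires_grad=True)  # w2, w1, b
optimizer = torch.optim.SGD([params], lr=1e-2)

for epoch in range(1, 11):
    t_p = model(t_u, *params)
    loss = loss_fn(t_p, t_c)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(f"Epoch {epoch}, Loss {loss.item()}")
```

Because `t_u ** 2` is in the thousands, the gradient with respect to `w2` is enormous, and the very first SGD step throws the parameters far from any reasonable value.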

After a few epochs, the loss goes to inf and the parameters become nan, as in the image below.

Can anyone explain why this happens and how to avoid it?

I think there are a few possible reasons:

  1. The parameter updates are too large, so each step overshoots the minimum. The optimization process is unstable: it diverges instead of converging to a minimum.
  2. Since the weights and bias reach extreme values after the first epoch, they keep oscillating with growing amplitude, driving the loss to inf. The solution is to normalize the input to [-1, 1] or [0, 1] (see the first sketch after this list).
  3. I was using SGD, which is sensitive to the scale of the inputs and makes the parameters overshoot. With the Adam optimizer, i.e. an adaptive learning rate per parameter, the model tends to converge even without scaling the input down to [-1, 1] or [0, 1] (see the second sketch after this list).
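
A minimal sketch of fix 2, reusing `model` and `loss_fn` from the sketch above (min-max scaling is one of several reasonable choices):

```python
# Min-max scaling maps t_u into [0, 1], so t_u ** 2 also stays in [0, 1]
# and the gradients for w2, w1 and b have comparable magnitudes.
t_un = (t_u - t_u.min()) / (t_u.max() - t_u.min())

params = torch.zeros(3, requires_grad=True)
optimizer = torch.optim.SGD([params], lr=1e-2)

for epoch in range(1, 5001):
    t_p = model(t_un, *params)  # train on the normalized input
    loss = loss_fn(t_p, t_c)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```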
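And a sketch of fix 3: Adam rescales each gradient by a running estimate of its magnitude, so the effective step size stays bounded by the learning rate even when the raw gradients are huge (the lr value here is a placeholder; tune it for your data):

```python
params = torch.zeros(3, requires_grad=True)
optimizer = torch.optim.Adam([params], lr=0.1)  # adaptive per-parameter step sizes

for epoch in range(1, 2001):
    t_p = model(t_u, *params)  # raw, unnormalized input
    loss = loss_fn(t_p, t_c)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```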