Loss: inf & Parameters: nan - Why?

I am training a simple polynomial model, `w2 * t_u ** 2 + w1 * t_u + b`.

Implementation Details
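
Here is a minimal sketch of the kind of setup I have (the `t_u`/`t_c` values and the learning rate are placeholders, not my exact data; any large, unnormalized input reproduces the behavior):

```python
import torch

# Hypothetical data: t_u is an unnormalized input spanning roughly 20-80,
# t_c is the target (the names follow the post; the values are made up).
t_u = torch.tensor([35.7, 55.9, 58.2, 81.9, 56.3, 48.9, 33.9, 21.8, 48.4, 60.4])
t_c = torch.tensor([0.5, 14.0, 15.0, 28.0, 11.0, 8.0, 3.0, -4.0, 6.0, 13.0])

def model(t_u, w2, w1, b):
    return w2 * t_u ** 2 + w1 * t_u + b

def loss_fn(t_p, t_c):
    return ((t_p - t_c) ** 2).mean()  # mean squared error

params = torch.zeros(3, requires_grad=True)  # w2, w1, b
optimizer = torch.optim.SGD([params], lr=1e-2)

for epoch in range(1, 11):
    t_p = model(t_u, *params)
    loss = loss_fn(t_p, t_c)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(f"Epoch {epoch}, Loss {loss.item()}")
```

Because `t_u ** 2` is in the thousands, the gradient with respect to `w2` is enormous, and the very first SGD step throws the parameters far from any reasonable value.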

After a few epochs, the loss goes to inf and the parameters become nan, as in the image below.

Can anyone explain why this happens and how to avoid it?

I think there are a few possible reasons:

  1. The parameter updates are too large, so each step overshoots the minimum. The optimization process is unstable: it diverges instead of converging to a minimum.
  2. Since the weights and bias reach extreme values after the first epoch, they keep oscillating with growing amplitude, driving the loss to inf. The solution is to normalize the input to [-1, 1] or [0, 1] (see the first sketch after this list).
  3. I was using SGD, which is sensitive to the scale of the inputs and makes the parameters overshoot. With the Adam optimizer, i.e. an adaptive learning rate per parameter, the model tends to converge even without scaling the input down to [-1, 1] or [0, 1] (see the second sketch after this list).
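
A minimal sketch of fix 2, reusing `model` and `loss_fn` from the sketch above (min-max scaling is one of several reasonable choices):

```python
# Min-max scaling maps t_u into [0, 1], so t_u ** 2 also stays in [0, 1]
# and the gradients for w2, w1 and b have comparable magnitudes.
t_un = (t_u - t_u.min()) / (t_u.max() - t_u.min())

params = torch.zeros(3, requires_grad=True)
optimizer = torch.optim.SGD([params], lr=1e-2)

for epoch in range(1, 5001):
    t_p = model(t_un, *params)  # train on the normalized input
    loss = loss_fn(t_p, t_c)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```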
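And a sketch of fix 3: Adam rescales each gradient by a running estimate of its magnitude, so the effective step size stays bounded by the learning rate even when the raw gradients are huge (the lr value here is a placeholder; tune it for your data):

```python
params = torch.zeros(3, requires_grad=True)
optimizer = torch.optim.Adam([params], lr=0.1)  # adaptive per-parameter step sizes

for epoch in range(1, 2001):
    t_p = model(t_u, *params)  # raw, unnormalized input
    loss = loss_fn(t_p, t_c)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```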