I'm using ChatGPT to learn linear regression, but I don't understand why it can't predict. Where is the mistake?
Epoch 400/1000, Loss: nan
Epoch 500/1000, Loss: nan
Epoch 600/1000, Loss: nan
Epoch 700/1000, Loss: nan
Epoch 800/1000, Loss: nan
Epoch 900/1000, Loss: nan
Epoch 1000/1000, Loss: nan
Predicted spending amount: nan
The problem is that your training is unstable: the gradient is large, each step is large
enough to make things worse, and the next gradient is even larger. (Becoming
unstable in this way is commonplace with gradient-based optimization algorithms
such as SGD.)
The root cause of this problem is that your data is of order 100, while, by default,
your `Linear` layer is (randomly) initialized to be appropriate for data of order one.
The preferred approach is to normalize your data to be of order one and train on
the normalized data.
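For example, here is a minimal sketch of the normalization approach. (Your dataset isn't shown, so the synthetic `x` and `y` below just stand in for data of order 100; the learning rate and epoch count are illustrative.)

```python
import torch

torch.manual_seed(0)

# Synthetic stand-in for the original data: inputs of order 100
x = torch.rand(50, 1) * 100.0
y = 3.0 * x + 20.0 + torch.randn(50, 1)

# Normalize inputs and targets to be of order one
x_mean, x_std = x.mean(), x.std()
y_mean, y_std = y.mean(), y.std()
x_norm = (x - x_mean) / x_std
y_norm = (y - y_mean) / y_std

model = torch.nn.Linear(1, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = torch.nn.MSELoss()

for epoch in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(x_norm), y_norm)
    loss.backward()
    optimizer.step()

# At prediction time, normalize the input and un-normalize the output
x_new = torch.tensor([[80.0]])
with torch.no_grad():
    pred = model((x_new - x_mean) / x_std) * y_std + y_mean
print(pred.item())  # roughly 3 * 80 + 20 = 260
```

Note that the same normalization statistics computed from the training data are reused at prediction time, and the model's output has to be mapped back to the original scale.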
However, you can also lower your learning rate, which makes your steps smaller,
eliminating the instability. This does slow your training down, but this can often be
addressed by using momentum with SGD. For your particular use case, a value for
the momentum quite close to one can be appropriate.
Here is a tweaked version of your code that shows stable training both without and with momentum: