Why is my feed-forward network predicting NaNs?

I did not see a category for noob questions or tabular data. Let me know if I need to be in a different category.

I am just learning the basics of PyTorch, and I am confused as to why I am getting NaNs or poor predictions. My input data was generated from a simple polynomial.

The code is on GitHub here

The data is on GitHub here

The learning rate could be too high here. Setting it to 0.0002 seems to help.
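To see how a learning rate that is too high can produce NaNs rather than just poor predictions, here is a minimal sketch in plain Python (no PyTorch). It is not the code from the linked repo: the function name `fit_slope`, the toy data, and both learning rates are made up for illustration. With a step size that is too large, each update overshoots the minimum by more than the last, the weight overflows to infinity, and the next update computes `inf - inf`, which is NaN.

```python
import math

def fit_slope(lr, steps=500):
    """Fit w in y = w * x by gradient descent on mean squared error."""
    xs = [float(i) for i in range(1, 11)]  # toy inputs (illustrative, not the thread's data)
    ys = [3.0 * x for x in xs]             # targets generated with a true slope of 3
    w = 0.0
    for _ in range(steps):
        # d/dw mean((w*x - y)^2) = 2 * mean(x * (w*x - y))
        grad = 2.0 * sum(x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    return w

w_small = fit_slope(lr=0.002)  # each step shrinks the error, so w approaches 3
w_large = fit_slope(lr=0.1)    # each step grows the error until w overflows and turns into nan
print(w_small, w_large, math.isnan(w_large))
```

The same mechanism applies to a network's weights: once one of them becomes infinite or NaN, every prediction that depends on it is NaN too.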

That is interesting. I guess I would have expected learning rate problems to result in bad predictions, not NaNs.

How do people generally adjust their learning rate given their architecture?

Learning rate scheduling is more of an art than a science in my opinion, but a schedule that decays the rate by an order of magnitude every fixed number of epochs is an okay baseline to start from. I’d also look up the canonical graphs of “learning rate too high” (loss not really decreasing, or exploding) and “learning rate too low” (loss decreasing slowly and steadily for a long time without leveling off).
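As a rough sketch of that baseline, the step-decay rule can be written in a few lines of plain Python. The helper name `step_decay`, the starting rate of 0.01, and the 30-epoch interval are arbitrary choices for illustration, not recommendations for this model. PyTorch provides the same rule as `torch.optim.lr_scheduler.StepLR`, which takes the optimizer, a `step_size`, and a `gamma`.

```python
def step_decay(base_lr, epoch, step_size=30, gamma=0.1):
    """Learning rate for a given epoch: multiplied by gamma once per step_size epochs."""
    return base_lr * gamma ** (epoch // step_size)

# Print the rate at a few epochs around each decay boundary.
for epoch in (0, 29, 30, 59, 60, 90):
    print(epoch, step_decay(0.01, epoch))
```

With `gamma=0.1` the rate drops by one order of magnitude at epochs 30, 60, 90, and so on. Both `step_size` and `gamma` are knobs to tune against the loss curve.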