Simple Linear Regression with PyTorch

richard · December 12, 2017, 4:17pm

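I think the small learning rate works because the values in the data are large. Large inputs produce large gradients, so with a bigger learning rate each update overshoots and the parameters blow up; a small learning rate keeps the steps small enough to stay stable. That’s why normalizing the data helps: once the inputs are on a small scale, the gradients are too, and a much larger learning rate works fine.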
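Here is a rough sketch of what I mean. It is plain Python rather than PyTorch so it runs without any dependencies, and the data (`x = 0..99`, `y = 2x + 1`), the learning rates, the step count, and the helper names `mse` and `fit` are all made up for illustration, not taken from the original question.

```python
import math

def mse(xs, ys, w, b):
    """Mean squared error of the line y = w*x + b on the data."""
    total = 0.0
    for x, y in zip(xs, ys):
        err = w * x + b - y
        total += err * err
    return total / len(xs)

def fit(xs, ys, lr, steps):
    """Plain full-batch gradient descent on MSE, starting from w = b = 0."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = sum(2.0 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2.0 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical data with large, unnormalized inputs.
xs = [float(i) for i in range(100)]
ys = [2.0 * x + 1.0 for x in xs]

# Standardize the inputs to zero mean and unit (population) standard deviation.
mean = sum(xs) / len(xs)
std = math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))
xs_norm = [(x - mean) / std for x in xs]

steps = 40

# Raw inputs, large learning rate: the updates overshoot and the loss grows.
w, b = fit(xs, ys, lr=0.1, steps=steps)
loss_raw_large = mse(xs, ys, w, b)

# Raw inputs, small learning rate: stable, but the bias is learned slowly.
w, b = fit(xs, ys, lr=1e-4, steps=steps)
loss_raw_small = mse(xs, ys, w, b)

# Normalized inputs, same large learning rate: stable and converges quickly.
w, b = fit(xs_norm, ys, lr=0.1, steps=steps)
loss_norm = mse(xs_norm, ys, w, b)

print("raw, lr=0.1:       ", loss_raw_large)
print("raw, lr=1e-4:      ", loss_raw_small)
print("normalized, lr=0.1:", loss_norm)
```

The gradient with respect to `w` is multiplied by `x`, so when `x` is around 100 the step size is effectively scaled up by the square of that. After normalization the same learning rate of 0.1 is well within the stable range.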