What's wrong with my Adam implementation?

aerinykim · July 17, 2018, 6:57am

Hello PyTorch community!
I’m trying to implement Adam myself for learning purposes.

I think I implemented everything correctly; however, the loss graph of my implementation is very spiky compared to that of torch.optim.Adam.

[Figure: loss graph of my Adam implementation]

[Figure: loss graph of torch.optim.Adam]

If someone could look at my code and tell me what I am doing wrong, I’d be very grateful. Thank you for PyTorch!
(The full code, including data, is super easy to run: AMS_pytorch/AdamFails_1dConvex.ipynb at master · aerinkim/AMS_pytorch · GitHub)
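
For comparison, here is a minimal sketch of the standard Adam update in plain Python. It is not the code from the notebook: the helper names `adam_step` and `minimize_quadratic` and the toy objective f(x) = (x - 3)^2 are assumptions made up for illustration, and the hyperparameter defaults are the commonly used ones (lr=1e-3, betas 0.9 and 0.999, eps=1e-8). Details that often cause a hand-written version to behave differently are the bias correction terms and starting the step counter `t` at 1.

```python
import math


def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter (illustrative helper)."""
    # Exponential moving averages of the gradient and the squared gradient.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    # Bias correction; t is the 1-based step count, so the divisors are never zero.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Parameter update; eps is added outside the square root.
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


def minimize_quadratic(steps=2000, lr=0.01):
    """Run Adam on the toy objective f(x) = (x - 3)^2, starting from x = 0."""
    x, m, v = 0.0, 0.0, 0.0
    for t in range(1, steps + 1):
        grad = 2.0 * (x - 3.0)  # derivative of (x - 3)^2
        x, m, v = adam_step(x, grad, m, v, t, lr=lr)
    return x


if __name__ == "__main__":
    print(minimize_quadratic())
```

On the first step the bias-corrected ratio `m_hat / sqrt(v_hat)` has magnitude close to 1, so the parameter should move by roughly `lr` regardless of the gradient scale. Checking that single step against torch.optim.Adam can be a quick sanity test for a custom implementation.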