Adam does not work in the LSTM language modelling example

In this example, when I run ‘python --cuda --emsize 1500 --nhid 1500 --dropout 0.65 --epochs 40 --tied’ with the default optimizer (SGD) on the PTB dataset, I reach a test perplexity of 72.30.
But when I use Adam instead of SGD, the result gets worse: it only reaches a test perplexity of 84.
Could anyone tell me why this happens? Thank you!

Did you try tuning the learning rate for Adam?
I guess that Adam needs a different learning rate for this problem.
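
To illustrate why a learning rate tuned for SGD does not carry over to Adam, here is a minimal pure-Python sketch of the two update rules on a single scalar parameter. Everything in it (`sgd_step`, `ScalarAdam`, `final_loss`, `sweep`, the quadratic toy loss, and the grid of learning rates) is made up for illustration and is not taken from the example script. It only shows that Adam's step size is roughly set by `lr` regardless of the gradient's magnitude, while SGD's step scales with the gradient, so the two optimizers usually need learning rates on different scales.

```python
import math


def sgd_step(param, grad, lr):
    """Plain SGD: the step scales linearly with the gradient."""
    return param - lr * grad


class ScalarAdam:
    """Adam update rule for one scalar parameter (illustration only)."""

    def __init__(self, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.m = 0.0  # running mean of gradients
        self.v = 0.0  # running mean of squared gradients
        self.t = 0    # step counter, used for bias correction

    def step(self, param, grad):
        self.t += 1
        self.m = self.beta1 * self.m + (1 - self.beta1) * grad
        self.v = self.beta2 * self.v + (1 - self.beta2) * grad * grad
        m_hat = self.m / (1 - self.beta1 ** self.t)
        v_hat = self.v / (1 - self.beta2 ** self.t)
        # The gradient is divided by its own running magnitude,
        # so the step is on the order of lr.
        return param - self.lr * m_hat / (math.sqrt(v_hat) + self.eps)


def final_loss(lr, steps=200, start=0.0, target=3.0):
    """Minimise the toy loss (x - target)^2 with Adam and return the final loss."""
    opt = ScalarAdam(lr=lr)
    x = start
    for _ in range(steps):
        grad = 2.0 * (x - target)
        x = opt.step(x, grad)
    return (x - target) ** 2


def sweep(learning_rates):
    """Run the toy problem once per learning rate."""
    return {lr: final_loss(lr) for lr in learning_rates}


if __name__ == "__main__":
    # Hypothetical grid; a real sweep would compare validation perplexity instead.
    for lr, loss in sweep([1e-4, 1e-3, 1e-2, 1e-1]).items():
        print(f"lr={lr:g}  final loss={loss:.4f}")
```

The ranking of learning rates on this toy problem says nothing about the LSTM itself. The point is only that the default of 0.001 is a starting value, and a sweep over several orders of magnitude, judged on validation perplexity, is a reasonable first check.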

Sure. I used Adam's recommended lr (0.001) at first, then tried some other learning rates, but they made little difference.