Hi,
I’m training an LSTM network and using Adam as the optimizer.
Which learning rate scheduler is generally recommended to pair with Adam?
I was under the impression that Adam adapts the learning rate internally, but I’ve found that if I manually reduce the learning rate when the validation loss plateaus, I can push the loss down further.
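For concreteness, here is a minimal, self-contained sketch of the kind of plateau-based reduction I mean, using a toy LSTM on random data (the model, sizes, and `patience` value are made up for illustration, not my actual setup):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)
head = nn.Linear(16, 1)
optimizer = torch.optim.Adam(
    list(lstm.parameters()) + list(head.parameters()), lr=1e-3)

# Cut the lr by 10x once the monitored loss stops improving for `patience` epochs.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode='min', factor=0.1, patience=5)

x = torch.randn(32, 20, 8)   # toy batch: (batch, seq_len, features)
y = torch.randn(32, 1)
loss_fn = nn.MSELoss()

for epoch in range(50):
    optimizer.zero_grad()
    out, _ = lstm(x)
    loss = loss_fn(head(out[:, -1]), y)  # predict from the last time step
    loss.backward()
    optimizer.step()
    scheduler.step(loss.item())  # in real training, pass a held-out validation loss
```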
To the best of my knowledge, it depends on the model you are training.
Personally, I decay the learning rate by a factor of 0.1 whenever the validation loss rises.
BTW, it seems RMSprop is used more often with LSTMs.
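Concretely, the decay itself is just a write to the optimizer’s `param_groups`; here is a sketch of the rule I use (`val_loss` and `best_val_loss` are placeholder names that would come from your own validation loop):

```python
def maybe_decay_lr(optimizer, val_loss, best_val_loss, factor=0.1):
    """Multiply every param group's lr by `factor` when validation loss worsens."""
    if val_loss > best_val_loss:
        for group in optimizer.param_groups:
            group['lr'] *= factor
    return min(val_loss, best_val_loss)  # updated best loss to carry forward
```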
Hi @guyrose3, how are you reducing the learning rate while using Adam? I’m trying to do something similar, but it seems that PyTorch’s implementation of Adam does not account for an external learning rate scheduler.
I’ve started a thread on Stack Exchange. See this for more info:
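For reference, this is the kind of external adjustment I’m attempting; a toy check (a single dummy parameter with hand-set gradients) of whether Adam picks up a new lr written into `param_groups`:

```python
import torch

p = torch.nn.Parameter(torch.zeros(3))
opt = torch.optim.Adam([p], lr=1e-3)

p.grad = torch.ones(3)
opt.step()                        # step taken at lr=1e-3

opt.param_groups[0]['lr'] = 1e-4  # external reduction, no scheduler involved
p.grad = torch.ones(3)
opt.step()                        # question: does this step use the new lr?
print(opt.param_groups[0]['lr'])  # 0.0001
```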