Is there a reason to use two optimizers?

Is there a reason to use two optimizers instead of a single one for both the encoder and the decoder in the following tutorial?

http://pytorch.org/tutorials/intermediate/seq2seq_translation_tutorial.html#training-the-model


In this case, using one optimizer is equivalent. But multiple optimizers are useful when you want different optimization algorithms for different parts of the model, or when you want to optimize different sets of parameters at different times.
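
To illustrate, here is a minimal sketch of both variants. The `encoder` and `decoder` modules are hypothetical stand-ins for the tutorial's EncoderRNN and AttnDecoderRNN:

```python
import itertools
import torch.nn as nn
import torch.optim as optim

# Hypothetical stand-ins for the tutorial's encoder/decoder modules.
encoder = nn.GRU(input_size=256, hidden_size=256)
decoder = nn.GRU(input_size=256, hidden_size=256)

# Option A: one optimizer over both parameter sets.
# Equivalent to the tutorial's setup, since both use the same
# algorithm and hyperparameters.
optimizer = optim.SGD(
    itertools.chain(encoder.parameters(), decoder.parameters()),
    lr=0.01,
)

# Option B: two optimizers, which additionally lets you pick a
# different algorithm or hyperparameters per part.
encoder_optimizer = optim.SGD(encoder.parameters(), lr=0.01)
decoder_optimizer = optim.Adam(decoder.parameters(), lr=1e-3)

# In the training loop, option B just means calling zero_grad() and
# step() on both optimizers:
#   loss.backward()
#   encoder_optimizer.step()
#   decoder_optimizer.step()
```

Note that if all you want is different hyperparameters (e.g. learning rates) per part, a single optimizer with parameter groups also works; separate optimizers mainly buy you different algorithms and independent `step()`/scheduler control.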


Separate optimizers also allow you to attach separate schedulers. For example, if you are fine-tuning a pretrained model, you may want to slowly ramp up the learning rate for the low-level (first) part of the model, while the final part uses a higher learning rate from the beginning.
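
A minimal sketch of that pattern, assuming a hypothetical pretrained `backbone` and a freshly initialized `head` (the module names, learning rates, and warm-up length are illustrative, not from the tutorial):

```python
import torch.nn as nn
import torch.optim as optim
from torch.optim.lr_scheduler import LambdaLR

# Hypothetical stand-ins: a pretrained low-level part whose learning
# rate we ramp up slowly, and a new final part that starts at full LR.
backbone = nn.Linear(512, 512)
head = nn.Linear(512, 10)

backbone_optimizer = optim.Adam(backbone.parameters(), lr=1e-4)
head_optimizer = optim.Adam(head.parameters(), lr=1e-3)

# Linear warm-up over the first 1000 steps, applied only to the
# backbone's optimizer; the head has no scheduler, so its learning
# rate stays constant from the start.
warmup_steps = 1000
backbone_scheduler = LambdaLR(
    backbone_optimizer,
    lr_lambda=lambda step: min(1.0, (step + 1) / warmup_steps),
)

# In the training loop:
#   loss.backward()
#   backbone_optimizer.step()
#   head_optimizer.step()
#   backbone_scheduler.step()
```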