I’m wondering how many optimizers are needed to train an architecture made of several modules.
Suppose my architecture is composed of three classifiers and two encoders:
-> Classifier1 Encoder1 -> Encoder2 -> Classifier2 -> Classifier3
Can one optimizer train this model well?
When I use one optimizer, I sum up all the losses (in this case, three classification losses), so I’m not sure whether all the modules are trained well simultaneously.
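
To make the single-optimizer setup concrete, here is a minimal sketch, assuming PyTorch. The layer sizes, the random data, and the names `encoder1`, `classifier1`, `train_step`, etc. are all hypothetical placeholders. The wiring is also an assumption: Classifier1 reads Encoder1's output, while Classifier2 and Classifier3 read Encoder2's output. Because the gradient of a sum is the sum of the gradients, one `backward()` call on the summed loss gives each parameter the gradient from every loss that depends on it.

```python
import itertools

import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the five modules; all sizes are arbitrary.
encoder1 = nn.Linear(8, 16)
encoder2 = nn.Linear(16, 16)
classifier1 = nn.Linear(16, 3)
classifier2 = nn.Linear(16, 3)
classifier3 = nn.Linear(16, 3)
modules = [encoder1, encoder2, classifier1, classifier2, classifier3]

# One optimizer that owns the parameters of every module.
optimizer = torch.optim.SGD(
    itertools.chain.from_iterable(m.parameters() for m in modules), lr=0.1
)
criterion = nn.CrossEntropyLoss()

# Random placeholder batch: 4 samples and one label vector per classifier.
x = torch.randn(4, 8)
y1, y2, y3 = (torch.randint(0, 3, (4,)) for _ in range(3))


def train_step():
    """Run one update using the sum of the three classification losses."""
    optimizer.zero_grad()
    h1 = torch.relu(encoder1(x))
    h2 = torch.relu(encoder2(h1))
    # Assumed wiring: classifier1 uses h1; classifier2 and classifier3 use h2.
    loss1 = criterion(classifier1(h1), y1)
    loss2 = criterion(classifier2(h2), y2)
    loss3 = criterion(classifier3(h2), y3)
    total = loss1 + loss2 + loss3
    total.backward()   # fills .grad for every parameter that fed into total
    optimizer.step()   # updates all five modules at once
    return total.item()


# Snapshot the parameters, take one step, and check which tensors moved.
before = [p.detach().clone() for m in modules for p in m.parameters()]
loss_value = train_step()
after = [p.detach() for m in modules for p in m.parameters()]
changed = [not torch.equal(b, a) for b, a in zip(before, after)]
print(all(changed))
```

In this sketch, `encoder1` receives gradient from all three losses, `encoder2` from the second and third, and each classifier only from its own loss. If one loss dominates the others in scale, weighting the terms before summing (for example `total = loss1 + 0.5 * loss2 + loss3`, with the weight chosen by you) is one possible adjustment; it does not require a second optimizer.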