Using a single optimizer to train both generators in CycleGAN

I was looking at the PyTorch implementation of CycleGAN by the author of the paper. In the code, the author chains the parameters of both generators and passes them to a single Adam optimizer. I don't understand the intuition behind training both networks with one optimizer. Shouldn't we use a separate optimizer for each network? Where am I wrong?

The code I looked up has two optimizers: optimizer_G and optimizer_D. Can you point to the code that you are referring to?

Yes, the code uses optimizer_G to train both Generator AB and Generator BA. They are different models and need different gradients, so why did the author use one optimizer? Perhaps I am misunderstanding it.

loss_G = loss_identity_A + loss_identity_B + loss_GAN_A2B + loss_GAN_B2A + loss_cycle_ABA + loss_cycle_BAB

Why are all the losses added together when the two models have different parameters?
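One way to check both points is a small experiment. The sketch below is not the author's code: the two `nn.Linear` layers are hypothetical stand-ins for the real generators, the GAN terms and loss weights are left out, and the Adam hyperparameters are assumed values. It shows two things. First, a loss term that does not involve a generator contributes no gradient to it, so summing the terms still leaves each parameter with only its own gradient. Second, Adam keeps its state per parameter, so one optimizer over the chained parameters should take the same steps as two separate optimizers with the same settings.

```python
import copy
import itertools

import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the two generators (the real ones are conv nets).
netG_A2B = nn.Linear(4, 4)
netG_B2A = nn.Linear(4, 4)

# Identical copies that will be trained with two separate optimizers.
netG_A2B_sep = copy.deepcopy(netG_A2B)
netG_B2A_sep = copy.deepcopy(netG_B2A)

real_A = torch.randn(8, 4)
real_B = torch.randn(8, 4)
l1 = nn.L1Loss()


def generator_loss(g_ab, g_ba):
    # Simplified version of the summed loss: GAN terms and weights omitted.
    loss_identity_A = l1(g_ba(real_A), real_A)      # involves only g_ba
    loss_identity_B = l1(g_ab(real_B), real_B)      # involves only g_ab
    loss_cycle_ABA = l1(g_ba(g_ab(real_A)), real_A) # involves both
    loss_cycle_BAB = l1(g_ab(g_ba(real_B)), real_B) # involves both
    return loss_identity_A + loss_identity_B + loss_cycle_ABA + loss_cycle_BAB


# Point 1: a term that only uses netG_B2A gives no gradient to netG_A2B.
loss_identity_A = l1(netG_B2A(real_A), real_A)
grads_unused = torch.autograd.grad(
    loss_identity_A, list(netG_A2B.parameters()), allow_unused=True
)

# Point 2: one Adam over chained parameters vs. two Adams, same settings.
adam_kwargs = dict(lr=2e-4, betas=(0.5, 0.999))  # assumed values
opt_joint = torch.optim.Adam(
    itertools.chain(netG_A2B.parameters(), netG_B2A.parameters()), **adam_kwargs
)
opt_ab = torch.optim.Adam(netG_A2B_sep.parameters(), **adam_kwargs)
opt_ba = torch.optim.Adam(netG_B2A_sep.parameters(), **adam_kwargs)

for _ in range(5):
    opt_joint.zero_grad()
    generator_loss(netG_A2B, netG_B2A).backward()
    opt_joint.step()

    opt_ab.zero_grad()
    opt_ba.zero_grad()
    generator_loss(netG_A2B_sep, netG_B2A_sep).backward()
    opt_ab.step()
    opt_ba.step()

joint_params = itertools.chain(netG_A2B.parameters(), netG_B2A.parameters())
sep_params = itertools.chain(netG_A2B_sep.parameters(), netG_B2A_sep.parameters())
same_weights = all(
    torch.allclose(p, q, atol=1e-6) for p, q in zip(joint_params, sep_params)
)

print("identity_A gradients w.r.t. netG_A2B:", grads_unused)
print("joint and separate optimizers agree:", same_weights)
```

If that holds, the single optimizer is mostly a convenience: the cycle terms need both generators in one backward pass anyway, and separate optimizers would only matter if you wanted different learning rates or schedules per generator.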