Thanks a lot for your answer. But in that example the two losses are reduced independently; if I want to reduce them in one loss function, is that still possible?
so the call to .backward fills each parameter's .grad attribute with a value (or accumulates into it if .grad is already populated)
.step then simply applies whatever gradient is stored in .grad to the parameter.
Two optimizer .step calls will both apply the same variable.grad to the variable. If that is what you want (which does not seem to be the case), then what you're asking for will work.
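A minimal sketch of that accumulate-then-apply behavior, assuming a single toy parameter `w` and plain SGD (all the specific names and values here are made up for illustration):

```python
import torch

# One toy parameter and one optimizer over it.
w = torch.ones(1, requires_grad=True)
opt = torch.optim.SGD([w], lr=0.1)

loss1 = (w * 2).sum()   # d(loss1)/dw = 2
loss1.backward()        # fills w.grad with 2
loss2 = (w * 3).sum()   # d(loss2)/dw = 3
loss2.backward()        # accumulates: w.grad is now 5

# .step applies whatever is sitting in w.grad, regardless of
# which backward calls produced it.
opt.step()              # w <- 1.0 - 0.1 * 5 = 0.5
```

Note that a second optimizer calling .step here would apply the very same accumulated w.grad again, which is rarely what you want.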
Instead, what you seem to want is two separate .grad attributes per variable, each optimized by its respective optimizer.
If you want to do this sort of thing, your best bet is messing with backward hooks.
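One possible shape for that, sketched with tensor hooks (register_hook): stash the gradient of each loss into its own buffer as it flows through, then copy each buffer into .grad right before the corresponding optimizer's .step call. The names `grads`, `current`, and `stash` are hypothetical, invented for this sketch:

```python
import torch

w = torch.ones(1, requires_grad=True)
opt1 = torch.optim.SGD([w], lr=0.1)
opt2 = torch.optim.SGD([w], lr=0.1)

grads = {}           # one stashed gradient per loss
current = ["loss1"]  # which loss is currently backpropagating

def stash(grad):
    # Tensor hook: called with the gradient flowing into w.
    grads[current[0]] = grad.clone()
    return grad

w.register_hook(stash)

(w * 2).sum().backward()   # stashes a gradient of 2 under "loss1"
current[0] = "loss2"
(w * 3).sum().backward()   # stashes a gradient of 3 under "loss2"

# Apply each stashed gradient with its own optimizer by copying it
# into w.grad before the corresponding .step call.
w.grad = grads["loss1"].clone()
opt1.step()                # w: 1.0 -> 0.8
w.grad = grads["loss2"].clone()
opt2.step()                # w: 0.8 -> 0.5
```

This is only one way to wire it up; the key point is that each optimizer sees its own gradient rather than the accumulated sum.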