Two optimizers for one model

Oh I see that you want to use two optimizers for two paths. The simplest way is to do the forward pass twice, and run backward + step after each forward.
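Roughly something like this (A1/B1/A2/B2, the optimizers, criterion, input and target all stand in for your own objects):

# Each path gets its own fresh forward, so each loss can
# backward + step independently.
optim1.zero_grad()
loss1 = criterion(B1(A1(input)), target)
loss1.backward()
optim1.step()

optim2.zero_grad()
loss2 = criterion(B2(A2(A1(input))), target)   # A1 runs a second time here
loss2.backward()
optim2.step()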

It’s kinda tricky if you don’t want to compute A1(input) twice. You would need to do something like:

optim1.zero_grad()
optim2.zero_grad()

temp = A1(input)            # shared activation, computed only once
temp_d = temp.detach()      # cut the graph so loss1's backward stops here
temp_d.requires_grad = True

# Path 1: backward through B1 only; this fills B1's grads and temp_d.grad.
res1 = B1(temp_d)
loss1 = criterion(res1, target)
loss1.backward()

# Push the gradient collected at temp_d back through A1's graph.
# retain_graph=True keeps that graph alive for the second backward below.
temp.backward(temp_d.grad, retain_graph=True)

# Path 2: backward through B2, A2 and (via temp) A1.
res2 = B2(A2(temp))
loss2 = criterion(res2, target)
loss2.backward()

# Step only after both backwards: stepping optim1 earlier would modify A1's
# weights in place, and autograd can then complain about an in-place
# modification during the second backward.
optim1.step()
optim2.step()
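For completeness, a minimal setup the snippets above could run against; the shapes, the SGD choice and the learning rates are made up, and which parameters each optimizer should own depends on what you want each loss to train:

import torch
from torch import nn, optim

A1 = nn.Linear(10, 20)   # shared trunk
B1 = nn.Linear(20, 5)    # head for path 1
A2 = nn.Linear(20, 20)   # extra block for path 2
B2 = nn.Linear(20, 5)    # head for path 2
criterion = nn.MSELoss()

optim1 = optim.SGD(list(A1.parameters()) + list(B1.parameters()), lr=0.1)
optim2 = optim.SGD(list(A2.parameters()) + list(B2.parameters()), lr=0.1)

input = torch.randn(8, 10)
target = torch.randn(8, 5)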