I have 2 models, m1 and m2, and 2 losses, l1 and l2. I want to train both models jointly: m1 needs to be updated using (l1 + l2), and m2 only by l2. To achieve this, I have 2 optimizers: o1 (for m1's parameters) and o2 (for m2's parameters).

Is the following approach correct for achieving what I want?
o1.zero_grad()      # clear any stale gradients before the new backward pass
o2.zero_grad()
y1 = m1(x1)
y2 = m2(x2)
loss1 = l1(y1, y)   # y is the ground truth
loss2 = l2(y1, y2)  # l2 works on the outputs of m1 and m2
loss = loss1 + loss2
loss.backward()
o1.step()
o2.step()
Here, I make the assumption that d(loss)/d(m2's parameters) will be the same as d(loss2)/d(m2's parameters), since loss1 does not depend on m2 at all.
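As a quick sanity check of that assumption, here is a minimal sketch in plain Python (no autograd) with scalar stand-ins for the two models: y1 = w1*x1 plays the role of m1 and y2 = w2*x2 the role of m2, with loss1 = (y1 - y)^2 and loss2 = (y1 - y2)^2. The parameter names w1, w2 and the quadratic losses are hypothetical choices for illustration; the point is that the hand-derived gradient of (loss1 + loss2) with respect to w2 reduces to the gradient of loss2 alone, because loss1 has no dependence on w2.

```python
def grads_wrt_w2(w1, w2, x1, x2, y):
    """Return d(loss1 + loss2)/dw2 and d(loss2)/dw2 for the scalar toy setup."""
    y1 = w1 * x1  # stand-in for m1's output
    y2 = w2 * x2  # stand-in for m2's output
    # loss1 = (y1 - y)^2 involves only w1, so its gradient w.r.t. w2 is zero
    dloss1_dw2 = 0.0
    # loss2 = (y1 - y2)^2; by the chain rule, d(loss2)/dw2 = 2*(y1 - y2)*(-x2)
    dloss2_dw2 = 2.0 * (y1 - y2) * (-x2)
    total = dloss1_dw2 + dloss2_dw2  # d(loss1 + loss2)/dw2
    return total, dloss2_dw2

total_grad, loss2_grad = grads_wrt_w2(w1=0.5, w2=-1.2, x1=2.0, x2=3.0, y=1.0)
print(total_grad == loss2_grad)  # prints True: the two gradients coincide
```

The same argument carries over to the real setup: loss.backward() accumulates d(loss1)/d(m2) = 0 into m2's parameters, so o2.step() effectively applies only the gradient from loss2.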