I have a model that consists of several networks:
```python
import torch.nn as nn

class ModelX(nn.Module):
    def __init__(self, args):
        super(ModelX, self).__init__()
        self.netA = NetworkA(args)
        self.netB = NetworkB(args)

    def forward(self, input):
        outA = self.netA(input)
        outB = self.netB(outA)
        return outA, outB

...
criterionA = nn.MSELoss()
criterionB = nn.CrossEntropyLoss()
...
input, oriA, oriB = dataloader()
...
model = ModelX(args)
outA, outB = model(input)
# PyTorch loss modules take (prediction, target) in that order
lossA = criterionA(outA, oriA)
lossB = criterionB(outB, oriB)
...
```
In this structure, the dynamic range of lossA is [0, 11] while that of lossB is [0, 4]. In addition, as described above, network A acts like a generator and network B acts like a classifier.
Since the dynamic ranges of the two losses are different, simple joint training (loss = lossA + lossB, then loss.backward()) does not work well.
Which is the most appropriate method to update this model?
1) calculate gradients and update the weight matrices of each network independently
2) normalize the losses and update jointly:
loss = alpha * lossA + beta * lossB
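For reference, a minimal runnable sketch of both options, using toy stand-in networks and synthetic data (the real NetworkA/NetworkB architectures, the alpha/beta values, and the optimizer choice are all assumptions, not part of the question above):

```python
import torch
import torch.nn as nn

# Toy stand-ins for NetworkA (generator-like) and NetworkB (classifier-like).
class NetworkA(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(8, 8)

    def forward(self, x):
        return self.fc(x)

class NetworkB(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(8, 3)

    def forward(self, x):
        return self.fc(x)

netA, netB = NetworkA(), NetworkB()
criterionA, criterionB = nn.MSELoss(), nn.CrossEntropyLoss()
x = torch.randn(4, 8)
targetA = torch.randn(4, 8)            # regression target for netA
targetB = torch.randint(0, 3, (4,))    # class labels for netB

# Option 2: weighted joint loss. alpha/beta are hypothetical weights,
# e.g. chosen here so both terms land on a comparable scale.
alpha, beta = 1.0 / 11.0, 1.0 / 4.0
opt = torch.optim.Adam(list(netA.parameters()) + list(netB.parameters()), lr=1e-3)
outA = netA(x)
outB = netB(outA)
loss = alpha * criterionA(outA, targetA) + beta * criterionB(outB, targetB)
opt.zero_grad()
loss.backward()
opt.step()

# Option 1: independent updates with separate optimizers.
# outA is detached before feeding netB so lossB updates only netB.
optA = torch.optim.Adam(netA.parameters(), lr=1e-3)
optB = torch.optim.Adam(netB.parameters(), lr=1e-3)

outA = netA(x)
lossA = criterionA(outA, targetA)
optA.zero_grad()
lossA.backward()
optA.step()

outB = netB(outA.detach())
lossB = criterionB(outB, targetB)
optB.zero_grad()
lossB.backward()
optB.step()
```

Note that in option 1, detaching outA decouples the two updates entirely; if netA should also receive gradient from lossB, keep the graph connected and call backward on both losses instead.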