I have a model that consists of several networks, e.g.:
class ModelX(nn.Module):
    def __init__(self, args):
        super(ModelX, self).__init__()
        self.netA = NetworkA(args)
        self.netB = NetworkB(args)

    def forward(self, input):
        outA = self.netA(input)
        outB = self.netB(outA)
        return outA, outB
...
criterionA = nn.MSELoss()
criterionB = nn.CrossEntropyLoss()
...
input, oriA, oriB = dataloader()
model = ModelX(args)
outA, outB = model(input)
lossA = criterionA(outA, oriA)
lossB = criterionB(outB, oriB)
...
In this structure, the dynamic range of lossA is [0, 11] while that of lossB is [0, 4]. In addition, as described above, network A behaves like a generator and network B like a classifier.
Since the dynamic ranges of the losses differ, simple joint training (loss = lossA + lossB followed by loss.backward()) does not work well.
Which is the most appropriate method to update this model?
1) Calculate gradients and update the weight matrices independently:

    lossA.backward(retain_graph=True)  # lossB shares netA's graph, so keep it alive
    lossB.backward()
    ...
    optimizerA.step()
    optimizerB.step()
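A minimal runnable sketch of option 1, using two `nn.Linear` layers as hypothetical stand-ins for `NetworkA` and `NetworkB` (the real definitions are not shown above):

```python
import torch
import torch.nn as nn

netA = nn.Linear(8, 8)  # stand-in for the generator-like NetworkA
netB = nn.Linear(8, 3)  # stand-in for the classifier-like NetworkB

criterionA = nn.MSELoss()
criterionB = nn.CrossEntropyLoss()

# One optimizer per sub-network.
optimizerA = torch.optim.Adam(netA.parameters(), lr=1e-3)
optimizerB = torch.optim.Adam(netB.parameters(), lr=1e-3)

x = torch.randn(4, 8)             # dummy input batch
oriA = torch.randn(4, 8)          # regression target for netA
oriB = torch.randint(0, 3, (4,))  # class labels for netB

outA = netA(x)
outB = netB(outA)

lossA = criterionA(outA, oriA)
lossB = criterionB(outB, oriB)

optimizerA.zero_grad()
optimizerB.zero_grad()
# lossB's graph shares netA's forward pass, so the first backward
# must retain the graph for the second one.
lossA.backward(retain_graph=True)
lossB.backward()
optimizerA.step()
optimizerB.step()
```

Note that even here, lossB.backward() still propagates gradients into netA through outA; if you want netB's loss to leave netA untouched, feed it netB(outA.detach()) instead.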
2) Normalize the losses and update jointly:

    loss = alpha * lossA + beta * lossB
    loss.backward()
    optimizer.step()
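A minimal runnable sketch of option 2, again with hypothetical `nn.Linear` stand-ins for the two networks. The weights alpha and beta below are chosen as the inverses of the stated loss maxima (11 and 4) so both losses land roughly in [0, 1]; the exact values are a tuning choice, not a prescription:

```python
import torch
import torch.nn as nn

netA = nn.Linear(8, 8)  # stand-in for NetworkA
netB = nn.Linear(8, 3)  # stand-in for NetworkB

criterionA = nn.MSELoss()
criterionB = nn.CrossEntropyLoss()

# One optimizer over all parameters for joint training.
optimizer = torch.optim.Adam(
    list(netA.parameters()) + list(netB.parameters()), lr=1e-3)

# Rescale each loss by the inverse of its stated maximum.
alpha, beta = 1.0 / 11.0, 1.0 / 4.0

x = torch.randn(4, 8)             # dummy input batch
oriA = torch.randn(4, 8)          # regression target for netA
oriB = torch.randint(0, 3, (4,))  # class labels for netB

outA = netA(x)
outB = netB(outA)
lossA = criterionA(outA, oriA)
lossB = criterionB(outB, oriB)

optimizer.zero_grad()
loss = alpha * lossA + beta * lossB  # single weighted objective
loss.backward()
optimizer.step()
```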