I have a model that consists of several networks, like this:

```
class ModelX(nn.Module):
    def __init__(self, args):
        super(ModelX, self).__init__()
        self.netA = NetworkA(args)
        self.netB = NetworkB(args)

    def forward(self, input):
        outA = self.netA(input)
        outB = self.netB(outA)
        return outA, outB

...
criterionA = nn.MSELoss()
criterionB = nn.CrossEntropyLoss()
...
input, oriA, oriB = dataloader()...
model = ModelX(args)
outA, outB = model(input)
lossA = criterionA(outA, oriA)  # prediction first, target second
lossB = criterionB(outB, oriB)
...
```

In this structure, the dynamic range of lossA is [0, 11] and that of lossB is [0, 4]. In addition, as described above, network A behaves like a generator and network B like a classifier.

Since the dynamic ranges of the two losses differ, simple joint training (`loss = lossA + lossB` followed by `loss.backward()`) does not work well.

Which is the most appropriate method to update this model?

**1) Calculate gradients and update the weight matrices independently**

```
lossA.backward()
lossB.backward()
...
optimizerA.step()
optimizerB.step()
```
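Option 1 can be sketched as follows. This is a minimal, self-contained example, not your actual model: the two `nn.Linear` layers stand in for NetworkA/NetworkB, and the shapes are made up. One caveat this variant makes explicit: because outB flows through netA, `lossB.backward()` would normally also deposit gradients into netA, so `detach()` is used here to keep the two updates truly independent.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for NetworkA ("generator-like") and
# NetworkB ("classifier-like"); shapes are illustrative only.
netA = nn.Linear(8, 8)
netB = nn.Linear(8, 3)

optimizerA = torch.optim.Adam(netA.parameters(), lr=1e-3)
optimizerB = torch.optim.Adam(netB.parameters(), lr=1e-3)

criterionA = nn.MSELoss()
criterionB = nn.CrossEntropyLoss()

x = torch.randn(4, 8)
targetA = torch.randn(4, 8)
targetB = torch.randint(0, 3, (4,))

outA = netA(x)
# detach() cuts the graph: lossB's gradient stops at netB and never
# reaches netA, so each optimizer sees only "its" loss.
outB = netB(outA.detach())

lossA = criterionA(outA, targetA)
lossB = criterionB(outB, targetB)

optimizerA.zero_grad()
optimizerB.zero_grad()
lossA.backward()   # gradients for netA only
lossB.backward()   # gradients for netB only
optimizerA.step()
optimizerB.step()
```

If you *want* lossB to also shape netA, drop the `detach()` and call `lossA.backward(retain_graph=True)` before `lossB.backward()` so the shared graph survives the first backward pass.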

**2) Normalize the losses and jointly update**

```
loss = alpha*lossA + beta*lossB
loss.backward()
optimizer.step()
```