What is the most appropriate method to update weights in transfer learning or alternative learning?

(HY kang) #1

I have a model that consists of several networks:

import torch.nn as nn

class ModelX(nn.Module):
    def __init__(self, args):
        super(ModelX, self).__init__()
        self.netA = NetworkA(args)
        self.netB = NetworkB(args)

    def forward(self, input):
        outA = self.netA(input)  # output of network A
        outB = self.netB(outA)   # network B consumes the output of network A
        return outA, outB

criterionA = nn.MSELoss()           # later called as criterionA(yA_pred, yA)
criterionB = nn.CrossEntropyLoss()  # later called as criterionB(yB_pred, yB)

input, oriA, oriB = dataloader()...

model = ModelX(args)
outA, outB = model(input)

# PyTorch loss functions take (prediction, target) in that order
lossA = criterionA(outA, oriA)
lossB = criterionB(outB, oriB)


In this structure, the dynamic range of lossA is [0, 11] and that of lossB is [0, 4]. In addition, as shown above, network A acts like a generator and network B like a classifier.

Since the dynamic ranges of the two losses differ, simple joint training (loss = lossA + lossB followed by loss.backward()) does not work well.

Which is the most appropriate method to update this model?

1) calculate gradient and update weight matrices independently


2) normalize loss function and jointly update
loss = alpha * lossA + beta * lossB
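
To make the two options concrete, here is a minimal sketch of one training step for each. Everything in it is an assumption made for illustration: the `nn.Linear` layers stand in for NetworkA and NetworkB, the batch is random data, `step_separate` and `step_joint` are hypothetical helper names, and the weights `alpha = 1/11` and `beta = 1/4` are simply the reciprocals of the loss ranges quoted above, not tuned values. In option 1, I read "independently" as meaning lossB should not reach network A, hence the `detach()`.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for NetworkA (generator-like) and NetworkB (classifier-like).
netA = nn.Linear(8, 8)
netB = nn.Linear(8, 3)

criterionA = nn.MSELoss()
criterionB = nn.CrossEntropyLoss()

# Dummy batch: input, regression target for A, class labels for B.
x = torch.randn(4, 8)
oriA = torch.randn(4, 8)
oriB = torch.randint(0, 3, (4,))


def step_separate(optA, optB):
    """Option 1: each loss updates only its own network."""
    outA = netA(x)
    # detach() keeps lossB's gradient from flowing back into netA.
    outB = netB(outA.detach())
    lossA = criterionA(outA, oriA)
    lossB = criterionB(outB, oriB)
    optA.zero_grad()
    optB.zero_grad()
    lossA.backward()  # fills gradients of netA only
    lossB.backward()  # fills gradients of netB only
    optA.step()
    optB.step()
    return lossA.item(), lossB.item()


def step_joint(opt, alpha=1.0, beta=1.0):
    """Option 2: one weighted loss, one backward pass through both networks."""
    outA = netA(x)
    outB = netB(outA)
    lossA = criterionA(outA, oriA)
    lossB = criterionB(outB, oriB)
    loss = alpha * lossA + beta * lossB
    opt.zero_grad()
    loss.backward()  # netA receives gradients from both lossA and lossB
    opt.step()
    return loss.item()


# One step of each option (illustrative learning rate and weights).
optA = torch.optim.SGD(netA.parameters(), lr=0.1)
optB = torch.optim.SGD(netB.parameters(), lr=0.1)
step_separate(optA, optB)

opt = torch.optim.SGD(list(netA.parameters()) + list(netB.parameters()), lr=0.1)
step_joint(opt, alpha=1 / 11, beta=1 / 4)
```

The practical difference is in what network A learns from: in the first version it only sees lossA, while in the second it is also pushed toward outputs that help the classifier, with `alpha` and `beta` controlling the balance.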