What is the most appropriate method to update weights in transfer learning or alternative learning?


(HY kang) #1

I have a model that consists of several networks, e.g.:

import torch.nn as nn

class ModelX(nn.Module):
    def __init__(self, args):
        super().__init__()
        self.netA = NetworkA(args)
        self.netB = NetworkB(args)

    def forward(self, input):
        outA = self.netA(input)
        outB = self.netB(outA)  # netB consumes netA's output
        return outA, outB

...
criterionA = nn.MSELoss()
criterionB = nn.CrossEntropyLoss()

...
input, oriA, oriB = dataloader()...

model = ModelX(args)
outA, outB = model(input)

# PyTorch criteria take (prediction, target) in that order
lossA = criterionA(outA, oriA)
lossB = criterionB(outB, oriB)

...

In this structure, the dynamic range of lossA is [0, 11] and that of lossB is [0, 4]. In addition, network A acts like a generator and network B like a classifier.

Since the dynamic ranges of the two losses differ, simple joint training (loss = lossA + lossB, then loss.backward()) does not work well.

Which is the most appropriate method to update this model?

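1) Calculate the gradients and update the weight matrices independently
# lossB also backpropagates through netA, so the shared graph must be retained
lossA.backward(retain_graph=True)
lossB.backward()

optimizerA.step()
optimizerB.step()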
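
A minimal sketch of option 1 in its strictest reading, where netA is driven by lossA only and netB by lossB only. Because outB is computed from outA, calling lossB.backward() on the undetached graph would also add gradients to netA, so this sketch detaches outA before feeding it to netB. The two nn.Linear layers, their sizes, the learning rates, and the random batch are hypothetical stand-ins for NetworkA, NetworkB, and the dataloader; they are assumptions, not part of the original model.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for NetworkA / NetworkB (sizes are made up).
netA = nn.Linear(8, 8)   # generator-like: regresses oriA
netB = nn.Linear(8, 3)   # classifier-like: 3 classes, fed by netA's output

criterionA = nn.MSELoss()
criterionB = nn.CrossEntropyLoss()

# One optimizer per sub-network, so each can use its own learning rate.
optimizerA = torch.optim.SGD(netA.parameters(), lr=0.01)
optimizerB = torch.optim.SGD(netB.parameters(), lr=0.01)

# Random batch in place of the real dataloader (assumption).
x = torch.randn(4, 8)
oriA = torch.randn(4, 8)
oriB = torch.randint(0, 3, (4,))

optimizerA.zero_grad()
optimizerB.zero_grad()

outA = netA(x)
# detach() cuts the graph here: lossB cannot send gradients into netA.
outB = netB(outA.detach())

lossA = criterionA(outA, oriA)   # criteria take (prediction, target)
lossB = criterionB(outB, oriB)

# The two graphs are now disjoint, so no retain_graph is needed.
lossA.backward()
lossB.backward()

optimizerA.step()
optimizerB.step()
```

Dropping the detach() call turns this into the variant where netA receives the sum of both gradients; in that case the first backward call needs retain_graph=True.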

2) Normalize the loss functions and update jointly
loss = alpha * lossA + beta * lossB
loss.backward()
optimizer.step()
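
A minimal sketch of how alpha and beta in option 2 could be chosen. It assumes one simple heuristic, dividing each loss by the upper bound of its observed range ([0, 11] and [0, 4] above) so that both terms fall roughly in [0, 1]. The helper names range_weights and joint_loss are hypothetical, and this is only one of several possible weighting schemes, not a recommendation from any library.

```python
def range_weights(max_a, max_b):
    """Return (alpha, beta) that rescale each loss to roughly [0, 1]."""
    return 1.0 / max_a, 1.0 / max_b


def joint_loss(loss_a, loss_b, alpha, beta):
    # Only * and + are used, so this works on plain floats and would
    # also work on PyTorch scalar tensors.
    return alpha * loss_a + beta * loss_b


# Upper bounds of the loss ranges quoted in the post.
alpha, beta = range_weights(11.0, 4.0)

# With both losses at their maximum, each weighted term is about 1.
worst_case = joint_loss(11.0, 4.0, alpha, beta)
```

In the training loop this would be used as loss = joint_loss(lossA, lossB, alpha, beta), followed by loss.backward() and optimizer.step(). Fixed weights only equalize the scale of the loss values; the gradient magnitudes of the two terms can still differ.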