How to adjust the relative importance of two losses

Here is my loss structure.
model1 and model2 currently contribute equally to the total loss.

learnable_params = list(model1.parameters()) + list(model2.parameters())
optimizer = optim.Adam(learnable_params, lr=0.01, betas=(0.9, 0.999))
...
optimizer.zero_grad()
...
total_loss = model1_loss + model2_loss
total_loss.backward()
optimizer.step()

If I want model1 to affect the whole network more than model2 does, does multiplying model1_loss by some value work, like this?

total_loss = model1_loss * 0.5 + model2_loss

Or should I modify the gradients directly?

Well, given what you described, “model1 affects the whole network more than model2”, you probably want to decrease the weight of model2 instead: multiplying model1_loss by 0.5 reduces model1’s influence rather than increasing it.

I have played around with changing the weights of different loss terms. If the two are the same type of loss, for example both MSELoss or both CrossEntropyLoss, then it should work as you expect. But I have seen this weighting scheme become ineffective when the two losses are of different types, and therefore on different scales.
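One way to check how much each term actually contributes is to compare gradient magnitudes under a given weighting. A rough sketch, with toy linear models and random data standing in for model1 and model2 (all names here are placeholders, not your actual setup):

```python
import torch
import torch.nn as nn

# Toy stand-ins for model1/model2 with two different loss types.
model1 = nn.Linear(4, 1)
model2 = nn.Linear(4, 1)
x, y = torch.randn(8, 4), torch.randn(8, 1)

model1_loss = nn.functional.mse_loss(model1(x), y)
model2_loss = nn.functional.l1_loss(model2(x), y)  # different loss type

# Backprop the weighted sum, then inspect each model's gradient norm.
(model1_loss * 0.5 + model2_loss).backward()
g1 = model1.weight.grad.norm().item()
g2 = model2.weight.grad.norm().item()
# If g1 and g2 differ wildly, the weight mostly compensates for scale
# rather than expressing a real preference between the two terms.
```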

Another idea you can try is to apply the losses at different frequencies: for example, model1_loss is applied every iteration, but model2_loss only every 2 iterations.


You could also introduce a hyperparameter \lambda and rewrite your loss as:
total_loss = model1_loss * \lambda + model2_loss

\lambda can then be fine-tuned like any other hyperparameter.
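A simple way to tune it is a small grid search over a few \lambda values, picking the one with the best held-out loss. A sketch with placeholder models and data (train, the seed, the step count, and the candidate values are all illustrative assumptions):

```python
import torch
import torch.nn as nn
import torch.optim as optim

def train(lam, steps=20):
    # Fixed seed so each lambda is compared on identical models/data.
    torch.manual_seed(0)
    model1 = nn.Linear(4, 1)
    model2 = nn.Linear(4, 1)
    opt = optim.Adam(list(model1.parameters()) + list(model2.parameters()),
                     lr=0.01)
    x, y = torch.randn(32, 4), torch.randn(32, 1)
    for _ in range(steps):
        opt.zero_grad()
        loss = (lam * nn.functional.mse_loss(model1(x), y)
                + nn.functional.mse_loss(model2(x), y))
        loss.backward()
        opt.step()
    with torch.no_grad():
        return nn.functional.mse_loss(model1(x), y).item()

# Try a few candidate weights and keep the best one.
results = {lam: train(lam) for lam in (0.1, 1.0, 10.0)}
```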