# How to normalize losses of different scale

I have two losses: one is a standard MSELoss and the other is a custom loss function I wrote for regularization.

The problem is that these losses are not necessarily on the same numerical scale, so I have to re-tune the weighting every time (dividing/multiplying one by a constant until they match). I would much prefer to set the relative weight of each loss once and not worry about it again, such that loss1 constitutes 99% of my loss value and loss2 constitutes 1% (some sort of normalized weighting).

I had previously added the two different loss functions together like this:

``````
batch_loss = reconstruction_loss + monotonic_loss
``````

But instead I want to normalize the losses so I can choose how much each one contributes to parameter updates. This is what I was thinking:

``````
def CombineLosses(losses, weights):
    # Divide each loss by its own (detached) magnitude via .item(),
    # then apply the weight, so each term's value equals its weight.
    combined_loss = 0.0
    for loss, wt in zip(losses, weights):
        combined_loss = combined_loss + (loss * wt) / loss.item()
    return combined_loss

batch_loss = CombineLosses([reconstruction_loss, monotonic_loss], [0.99, 0.01])
``````
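For concreteness, here is a self-contained toy version of what I mean, with two made-up quadratic "losses" on very different scales standing in for my real MSE and regularization terms (the 1e8-scale one plays the role of the regularizer):

```python
import torch

def CombineLosses(losses, weights):
    # Scale each loss by its detached magnitude, then weight it.
    combined_loss = 0.0
    for loss, wt in zip(losses, weights):
        combined_loss = combined_loss + (loss * wt) / loss.item()
    return combined_loss

# Two toy losses built from the same parameter, on wildly different scales.
p = torch.tensor(2.0, requires_grad=True)
small = 0.04 * p ** 2   # stands in for the MSE term (~0.16)
big = 1e8 * p ** 2      # stands in for the regularizer (~4e8)

total = CombineLosses([small, big], [0.99, 0.01])
total.backward()
print(total.item())   # 1.0 — each term contributes exactly its weight
print(p.grad.item())  # gradients from both terms now have comparable scale
```

With both toy losses being quadratics in the same parameter, the combined value is exactly 1.0 and the gradient contributions end up in the 99/1 proportion, which is the behavior I am hoping for with my real losses.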

The idea is that I can then weight them, and even if my regularization function sums to 1e8 while MSELoss sums to 0.16, I can still incorporate both. However, I might be misunderstanding how autograd works, so I have a couple of questions:

1. Will scaling my two loss functions actually limit each loss’s contribution proportionally to its weight? If not, what is the preferred way to combine losses of different scales in PyTorch?
2. As an alternative, could it be possible to alternate between training on one loss function and training on the other, i.e. switching every other epoch?
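For question 2, this is roughly what I had in mind, with a toy model, optimizer, and a stand-in regularizer in place of my real ones:

```python
import torch

# Toy stand-ins for my real model, optimizer, and losses.
model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
mse = torch.nn.MSELoss()

def monotonic_loss_fn(output):
    # Stand-in regularizer: penalize decreases between consecutive outputs.
    return torch.relu(output[:-1] - output[1:]).sum()

for epoch in range(4):
    x = torch.randn(8, 4)
    y = torch.randn(8, 1)
    output = model(x)
    # Alternate: even epochs train on reconstruction, odd epochs on the regularizer.
    if epoch % 2 == 0:
        loss = mse(output, y)
    else:
        loss = monotonic_loss_fn(output)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

I am unsure whether switching objectives like this converges sensibly or whether the two losses would just fight each other across epochs, which is why I am asking.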