I’m training a net on a multi-task problem. Say the input is a facial image and there are 2 losses: cross entropy on gender and cross entropy on age (age having 101 different classes), so that total_loss = loss_age + loss_gender.

The problem is that some images have a missing label, e.g. an image where only the age is given, or only the gender.
I know that I should set the loss to 0 (or any constant) for a missing label, but I am not sure if I should manually write a new loss/criterion or if there’s an elegant way to do that in PyTorch. How would you go about it?
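For reference, a minimal sketch of my setup (the backbone and layer sizes are just illustrative):

import torch
import torch.nn as nn

class FaceNet(nn.Module):
    """Illustrative two-head network: shared backbone, separate age and gender heads."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 128), nn.ReLU())
        self.age_head = nn.Linear(128, 101)   # 101 age classes
        self.gender_head = nn.Linear(128, 2)  # 2 gender classes

    def forward(self, x):
        features = self.backbone(x)
        return self.age_head(features), self.gender_head(features)

model = FaceNet()
criterion = nn.CrossEntropyLoss()
images = torch.randn(4, 3, 64, 64)           # dummy batch of facial images
age_target = torch.tensor([23, 57, 31, 70])
gender_target = torch.tensor([0, 1, 1, 0])

age_logits, gender_logits = model(images)
total_loss = criterion(age_logits, age_target) + criterion(gender_logits, gender_target)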

Classification works better for age estimation (take a look at any age estimation paper or recent papers from CVPR).

Also, my original suggestion of setting the loss to 0 or a constant is way cleaner than your suggestion. My question was whether there is a more elegant way to write it.

Regarding the underlying mathematics, there is no difference between setting the loss to zero and not calculating/backpropagating it at all: a loss value of zero produces a zero gradient, so no update happens for that part, which is the same result as not calculating the loss for it in the first place.
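You can check this in a few lines (a minimal sketch, not tied to the setup above): zero out per-sample losses with a mask and confirm the masked samples receive a zero gradient.

import torch

x = torch.randn(4, requires_grad=True)
per_sample_loss = x ** 2                   # stand-in for any per-sample loss
mask = torch.tensor([1.0, 0.0, 1.0, 0.0])  # 0 marks a "missing label"
loss = (per_sample_loss * mask).sum()
loss.backward()
print(x.grad)  # entries 1 and 3 are exactly zero: no update for masked samples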

On the implementation side you could use negative values to indicate missing labels. For an element-wise loss such as L1 or MSE, where pred and target have the same shape, you could simply do something like

pred[target < 0] = target[target < 0]

and later on just calculate the loss as you did before. This results in a loss value of 0 for the affected samples, since prediction and target are same-valued. For cross entropy, however, pred holds logits of shape (batch, num_classes) while target holds class indices, so that assignment doesn’t apply; instead, select only the samples with a valid label before computing the loss, as shown in the sketch below.
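A minimal sketch of that masking for the age task (the tensor names and the -1 encoding for missing labels are assumptions):

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
pred = torch.randn(4, 101)               # age logits for a batch of 4
target = torch.tensor([23, -1, 57, -1])  # -1 marks a missing age label

mask = target >= 0
if mask.any():
    loss_age = criterion(pred[mask], target[mask])
else:
    loss_age = pred.new_zeros(())        # no valid labels in this batch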

@justusschock Good point about zero loss not contributing to backprop. In newer versions of PyTorch, nn.CrossEntropyLoss has an ignore_index parameter which applies your logic internally and saves a bit of code: samples whose target equals ignore_index contribute neither to the loss nor to the gradients. Choosing ignore_index=-1 matches the negative-value convention above and avoids clashing with a real class index such as 0. For instance, no matter what the pred tensor is, the ignored sample has no effect:

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss(ignore_index=-1)
pred = torch.randn(3, 5, requires_grad=True)
target = torch.tensor([2, -1, 4])  # the second sample has a missing label
loss = criterion(pred, target)     # averaged over the two valid samples only
loss.backward()
print(pred.grad[1])                # all zeros: the ignored sample gets no gradient
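Applied to the original two-task question, both criteria can share that convention. A sketch under the assumption that every missing label is encoded as -1 (tensor names are illustrative):

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss(ignore_index=-1)
age_logits, gender_logits = torch.randn(4, 101), torch.randn(4, 2)
age_target = torch.tensor([23, -1, 57, 70])   # one missing age label
gender_target = torch.tensor([0, 1, -1, -1])  # two missing gender labels
total_loss = criterion(age_logits, age_target) + criterion(gender_logits, gender_target)

One caveat: if every label of one task happens to be missing in a batch, the mean-reduced loss for that task may come out as NaN in recent PyTorch versions, so it can be worth guarding that case (e.g. with the mask check shown earlier).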