Calculate partial losses only


I’m sorry if this is kind of a noob question, but I am unable to find an answer to this anywhere.

I want to train a pose detection network that outputs a confidence score and the 3D coordinates of specific landmarks. The data has labels for both: a flag indicating whether something detectable is in the image and, if so, the 3D coordinates the network should detect.

I now want to calculate the loss for these values. My thinking was that I need a binary classification loss for the confidence and a regression loss for the coordinates. However, when nothing detectable is present, I obviously do not want any gradients for the coordinates, as there is no correct way to change them, but I do still want gradients for the confidence that something is present. Is it okay to just mask out the entries of the keypoint tensor where the confidence should be 0? Does this still calculate the right gradients? Or is it an entirely wrong approach?

Just to visualize better, here is a code snippet of what the loss calculation would look like:

import torch.nn as nn


class CoordinateLoss(nn.Module):
    def forward(self, prediction, target):
        confidence_loss_fn = nn.BCELoss()
        keypoint_loss_fn = nn.MSELoss()

        # Samples whose label flag says something detectable is present.
        detection_possible_mask = target[:, 0] == 1

        # The confidence loss uses every sample in the batch.
        confidence_loss = confidence_loss_fn(
            prediction[0][:, 0], target[:, 0].float())

        # The keypoint loss uses only the samples selected by the mask.
        keypoint_loss = keypoint_loss_fn(
            prediction[2][detection_possible_mask],
            target[:, 2:][detection_possible_mask].float())
        return confidence_loss + keypoint_loss

Thanks a lot for your help!

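Yes. Your code looks fine to me. Rows removed by the boolean mask never enter the keypoint loss, so they receive zero gradient from it, while the confidence loss still sees every sample.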
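If you want to convince yourself, a quick sanity check along these lines might help. This is only a minimal sketch: the shapes, the tensor names and the `masked_mse` helper are made up for illustration and are not taken from your model. It also guards against one edge case, namely a batch in which no sample is detectable, where the mean over an empty selection would come out as NaN.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical shapes: a batch of 4 samples with 3 coordinates each.
pred_keypoints = torch.randn(4, 3, requires_grad=True)
target_keypoints = torch.randn(4, 3)
# Samples 0 and 2 contain something detectable; samples 1 and 3 do not.
mask = torch.tensor([True, False, True, False])


def masked_mse(pred, target, mask):
    """MSE over the selected rows only (hypothetical helper)."""
    if not mask.any():
        # Nothing selected: return a zero that is still attached to the
        # graph instead of the NaN a mean over no elements would give.
        return pred.sum() * 0.0
    return F.mse_loss(pred[mask], target[mask])


loss = masked_mse(pred_keypoints, target_keypoints, mask)
loss.backward()

grad = pred_keypoints.grad
# Rows excluded by the mask get exactly zero gradient.
masked_rows_have_zero_grad = bool((grad[~mask] == 0).all())
# Rows kept by the mask do receive a gradient.
kept_rows_have_grad = bool((grad[mask] != 0).any())

# Edge case: a batch where nothing is detectable.
empty_loss = masked_mse(pred_keypoints, target_keypoints,
                        torch.zeros(4, dtype=torch.bool))
```

The gradient for `pred_keypoints` has the full batch shape, with zeros in the rows the mask dropped, which is what you want here.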
I assume that during inference, you would have a threshold on the confidence values.
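Something like the following is what I have in mind for inference. Treat it as a rough sketch: the tensors stand in for your network outputs, and the threshold of 0.5 is just an assumed starting value that you would tune on validation data.

```python
import torch

# Hypothetical network outputs for a batch of 3 images.
confidence = torch.tensor([0.92, 0.10, 0.55])  # sigmoid outputs in [0, 1]
keypoints = torch.arange(9, dtype=torch.float32).reshape(3, 3)

threshold = 0.5  # assumed value, not taken from the original post
detected = confidence > threshold

# Keep coordinates only for images where the network is confident enough;
# the coordinates predicted for the other images are not meaningful.
valid_keypoints = keypoints[detected]
```

Since the coordinate outputs are never trained on images without a detectable object, they should simply be ignored whenever the confidence falls below the threshold.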