Ignore loss on some outputs depending on others?

supagu · January 23, 2023, 6:53am

I’m trying to guess a bounding box + a few sets of class labels based on an image.
It seems my bounding box code is struggling to learn. I suspect this is because, when a certain class label is set, no bounding box is needed, so I just return [0,0,0,0] for the bounding box in that case.

Here is how I make a prediction and compute the loss:

pred_bounds, pred_node, pred_display = model(X)

loss_1 = criterion_1(pred_bounds, Y_bounds)
loss_2 = criterion_2(pred_node, Y_node_class)
loss_3 = criterion_3(pred_display, Y_display_class)
loss_total = loss_1/1000.0 + loss_2 + loss_3

I’m trying to predict the bounding box + a class label called “node” + another class label called “display” (these labels are one-hot encoded).
If node == [0,1] then I don’t care what the bounding box loss is, as this class label means I can ignore the bounding box.
Should I somehow conditionally ignore the bounding box loss (loss_1) for the samples where I don’t care what the bounding box is?
How do I do that?

Also, I’m scaling my bounding box error down (see the /1000), thinking it might lead to smaller adjustments to the bounding box nodes. I’m not sure whether I need to, or should, be doing that.

eqy · January 23, 2023, 7:53am

I would check if simply setting the loss to zero for those examples would make a difference, provided that you can pass or compute this information in your criterion function.
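
A minimal sketch of that idea, under a few assumptions that are not in the original post: the node label is a one-hot tensor of shape (N, 2), node == [0,1] means "no box", and the box criterion is smooth L1 (the thread does not say what criterion_1 is). The helper name masked_bbox_loss is hypothetical. It computes a per-sample loss, zeroes the samples to ignore, and averages over the remaining ones.

```python
import torch
import torch.nn.functional as F


def masked_bbox_loss(pred_bounds, target_bounds, node_onehot):
    """Smooth L1 box loss averaged only over samples that have a box.

    Assumption: node_onehot has shape (N, 2) and [0, 1] means "no box".
    """
    # 1.0 where the box matters, 0.0 where node == [0, 1]
    keep = (node_onehot.argmax(dim=1) != 1).to(pred_bounds.dtype)
    # reduction="none" keeps shape (N, 4); average the 4 coordinates per sample
    per_sample = F.smooth_l1_loss(
        pred_bounds, target_bounds, reduction="none"
    ).mean(dim=1)
    # Zero out ignored samples; clamp avoids dividing by zero
    # when every sample in the batch is ignored
    return (per_sample * keep).sum() / keep.sum().clamp(min=1.0)


# Toy batch: the second sample has node == [0, 1], so its box is ignored
pred = torch.tensor(
    [[10.0, 10.0, 50.0, 50.0], [3.0, 4.0, 5.0, 6.0]], requires_grad=True
)
target = torch.tensor([[10.0, 10.0, 50.0, 60.0], [0.0, 0.0, 0.0, 0.0]])
node = torch.tensor([[1.0, 0.0], [0.0, 1.0]])

loss = masked_bbox_loss(pred, target, node)
loss.backward()
```

Because the ignored rows are multiplied by zero, they contribute no gradient to the box outputs, so the placeholder [0,0,0,0] targets no longer pull the predictions toward the origin. Dividing by the number of kept samples, rather than the batch size, keeps the loss scale comparable across batches with different numbers of ignored samples.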

Yes, scaling the relative weight of the different parts of your loss calculation is a common practice, even if it does introduce additional hyperparameters. A somewhat famous example: Andrej Karpathy: Tesla Autopilot and Multi-Task Learning for Perception and Prediction - YouTube
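
One way to keep those weights visible as hyperparameters is to name them instead of hard-coding a /1000. This is only a sketch: the LOSS_WEIGHTS values and the combine_losses helper are hypothetical, and the tensors below are placeholder numbers standing in for loss_1, loss_2 and loss_3.

```python
import torch

# Hypothetical weights; tune them as hyperparameters on validation data
LOSS_WEIGHTS = {"bounds": 1e-3, "node": 1.0, "display": 1.0}


def combine_losses(losses, weights=LOSS_WEIGHTS):
    """Weighted sum of per-task scalar losses, keyed by task name."""
    return sum(weights[name] * value for name, value in losses.items())


# Placeholder values standing in for loss_1, loss_2 and loss_3
losses = {
    "bounds": torch.tensor(250.0),
    "node": torch.tensor(0.7),
    "display": torch.tensor(1.1),
}
loss_total = combine_losses(losses)
```

With a weight of 1e-3 on the box term this matches the loss_1/1000.0 scaling from the question, but the weights can now be changed in one place.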

supagu · January 23, 2023, 10:38pm

Thanks for that eqy!