Generalized_box_iou_loss diverges to negative values

Hello, PyTorch community,

I’m currently working on an object detection task and I’m interested in implementing the Generalized Intersection over Union (GIoU) Loss instead of the usual MSELoss. While referring to the generalized_box_iou_loss function in PyTorch, I noticed that this loss function expects bounding box values to adhere to the condition 0 <= x1 < x2. I have a couple of questions regarding this:

  1. My bounding box regression values are normalized with respect to the image width and height. Should I directly input these normalized values into the GIoU loss function, or is it necessary to denormalize them before use?

  2. During the initial training phase, the model’s output can be quite unpredictable. This makes it challenging to ensure that the condition 0 <= x1 < x2 is always satisfied, potentially leading to negative loss values. In such cases, the optimizer tends to drive the loss towards negative infinity.

I’m seeking guidance on how to effectively train using the GIoU loss function under these conditions. Any insights or recommendations would be greatly appreciated.

Thank you in advance for your assistance!

Hi,

I have the same doubts and was wondering if you have already figured this out. Please let me know.

Thanks in advance!

There are a couple of ways of going about this problem; here is one suggestion:

  1. Work with bounding boxes in the YOLO format (center_x, center_y, bb_width, bb_height) normalized with respect to (image_width, image_height). The normalization ensures that all target bounding boxes are represented by values between 0 and 1.

  2. Call torch.sigmoid() on the output of your model. This keeps all predicted values in the range 0 to 1.

Using this strategy, it is impossible for the model to output an invalid bounding box: the predicted width and height are strictly positive after the sigmoid, so x1 < x2 and y1 < y2 always hold after converting to corner format. Also, the center of the bounding box, and at least a quarter of its area, will always be inside the image.