This topic isn’t strictly about PyTorch; it is a general computer vision question. The 2D bounding box intersection over union (IOU) between a ground-truth box and a model output is relatively straightforward to calculate, and thus translates neatly into a loss function. For 3D object detection, it would be nice to extend the IOU concept into 3D. Unfortunately, as far as I can tell there isn’t a simple way to calculate the intersection volume of two arbitrarily oriented six-sided 3D boxes; even 2D IOU with arbitrary quadrilaterals instead of axis-aligned rectangles is somewhat difficult.
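
To make the easy case concrete, here is a minimal sketch of axis-aligned IOU. It assumes each box is given as a pair of min-corner and max-corner tuples; that representation and the function name `axis_aligned_iou` are illustrative choices, not taken from any library. The same per-axis overlap logic works in 2D and 3D, but it does not handle rotated boxes, which is the hard case this question is about.

```python
def box_volume(mins, maxs):
    """Area (2D) or volume (3D) of an axis-aligned box."""
    volume = 1.0
    for lo, hi in zip(mins, maxs):
        volume *= (hi - lo)
    return volume


def axis_aligned_iou(box_a, box_b):
    """IOU of two axis-aligned boxes, each given as (mins, maxs).

    Assumption: mins[i] < maxs[i] on every axis, and both boxes
    have the same number of dimensions (2 or 3).
    """
    mins_a, maxs_a = box_a
    mins_b, maxs_b = box_b

    # The intersection of two axis-aligned boxes is itself an axis-aligned
    # box, so its size is the product of the per-axis overlaps.
    intersection = 1.0
    for lo_a, hi_a, lo_b, hi_b in zip(mins_a, maxs_a, mins_b, maxs_b):
        overlap = min(hi_a, hi_b) - max(lo_a, lo_b)
        if overlap <= 0:
            return 0.0  # boxes are disjoint (or only touch) on this axis
        intersection *= overlap

    union = box_volume(mins_a, maxs_a) + box_volume(mins_b, maxs_b) - intersection
    return intersection / union


# 2D: two 2x2 squares overlapping in a 1x1 region
iou_2d = axis_aligned_iou(((0, 0), (2, 2)), ((1, 1), (3, 3)))

# 3D: two 2x2x2 cubes overlapping in a 1x1x1 region
iou_3d = axis_aligned_iou(((0, 0, 0), (2, 2, 2)), ((1, 1, 1), (3, 3, 3)))
```

Once either box is rotated, the intersection is no longer a box, so the per-axis product above stops being valid.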

I am wondering if anyone knows either: (a) how the 3D bounding box IOU is calculated (I have found papers that reference it, but never with a description of the calculation), or (b) what loss functions are used instead of one based on 3D volume IOU.