I am trying to evaluate average precision (AP) for a face detection algorithm. I have created a CSV file with all the predicted bounding boxes, and I have another CSV in the same format with all the ground-truth bounding boxes. I found plenty of simple code snippets online for calculating bounding-box overlap and AP, but they don't handle the case where the number of predictions differs from the number of ground-truth boxes. For example, if an image has 8 predicted bounding boxes but 10 ground-truth boxes, how do I calculate the average precision?
I tried to mimic one of the algorithms and create a CSV file in the format below:
# Filename-1.jpg
[ x1, y1, x2, y2 ]
[ x1, y1, x2, y2 ]
[ x1, y1, x2, y2 ]
[ x1, y1, x2, y2 ]
# Filename-2.jpg
[ x1, y1, x2, y2 ]
[ x1, y1, x2, y2 ]
[ x1, y1, x2, y2 ]
# Filename-3.jpg
[ x1, y1, x2, y2 ]
[ x1, y1, x2, y2 ]
...
...
...
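For reference, a file in this layout could be read with something like the sketch below. `parse_boxes` is just an illustrative name, and I am assuming every box line has exactly four comma-separated numbers inside square brackets and that each `#` line starts a new image.

```python
def parse_boxes(text):
    """Parse '# name' header lines followed by '[ x1, y1, x2, y2 ]' lines.

    Returns a dict mapping each file name to a list of [x1, y1, x2, y2] floats.
    Assumes the layout shown above; not a general CSV reader.
    """
    boxes = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith("#"):
            # A header line starts a new image; register it even if it has no boxes
            current = line.lstrip("#").strip()
            boxes.setdefault(current, [])
        elif current is not None:
            # Drop the surrounding brackets and spaces, then split on commas
            coords = [float(v) for v in line.strip("[] ").split(",")]
            boxes[current].append(coords)
    return boxes
```

Calling it on the contents of the prediction file and the ground-truth file gives two dictionaries keyed by file name, so the boxes of the same image can be compared even when the two lists have different lengths.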
I am also sharing what I tried so far:
import numpy as np


def get_max_iou(pred_boxes, gt_box):
    """
    Calculate the IoU between multiple pred_boxes and one gt_box.
    pred_boxes: array of shape (N, 4), predicted boxes as [x1, y1, x2, y2]
    gt_box: array of shape (4,), one ground-truth box as [x1, y1, x2, y2]
    return: the IoU of every prediction, the max IoU, and the index of that prediction
    """
    # Note: if there are no predictions, this falls through and returns None
    if pred_boxes.shape[0] > 0:
        # 1. calculate the intersection coordinates
        #    (gt_box is a single 1-D box, so it is indexed as gt_box[i], not gt_box[:, i])
        ixmin = np.maximum(pred_boxes[:, 0], gt_box[0])
        ixmax = np.minimum(pred_boxes[:, 2], gt_box[2])
        iymin = np.maximum(pred_boxes[:, 1], gt_box[1])
        iymax = np.minimum(pred_boxes[:, 3], gt_box[3])
        iw = np.maximum(ixmax - ixmin + 1., 0.)
        ih = np.maximum(iymax - iymin + 1., 0.)
        # 2. calculate the area of the intersection
        inters = iw * ih
        # 3. calculate the area of the union
        uni = ((pred_boxes[:, 2] - pred_boxes[:, 0] + 1.) * (pred_boxes[:, 3] - pred_boxes[:, 1] + 1.) +
               (gt_box[2] - gt_box[0] + 1.) * (gt_box[3] - gt_box[1] + 1.) -
               inters)
        # 4. calculate the overlaps, the max overlap, and the index of the best pred_box
        iou = inters / uni
        iou_max = np.max(iou)
        nmax = np.argmax(iou)
        return iou, iou_max, nmax
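
To make the unequal-count case concrete, here is a rough sketch of the per-image matching I think is needed. It is not a full AP computation: my files contain no confidence scores, which AP needs in order to rank the predictions, so this only counts true positives, false positives, and false negatives for one image. The names `box_iou` and `match_image` and the IoU threshold of 0.5 are my own assumptions, not taken from any library.

```python
def box_iou(a, b):
    """IoU of two [x1, y1, x2, y2] boxes, using the +1 pixel convention."""
    iw = max(min(a[2], b[2]) - max(a[0], b[0]) + 1.0, 0.0)
    ih = max(min(a[3], b[3]) - max(a[1], b[1]) + 1.0, 0.0)
    inter = iw * ih
    area_a = (a[2] - a[0] + 1.0) * (a[3] - a[1] + 1.0)
    area_b = (b[2] - b[0] + 1.0) * (b[3] - b[1] + 1.0)
    return inter / (area_a + area_b - inter)


def match_image(pred_boxes, gt_boxes, iou_thresh=0.5):
    """Greedily match predictions to ground truths, one-to-one.

    Returns (tp, fp, fn) for a single image. The lists may have
    different lengths; unmatched predictions count as false positives
    and unmatched ground truths as false negatives.
    """
    used = set()  # indices of ground-truth boxes already matched
    tp = 0
    for p in pred_boxes:
        best_iou, best_j = 0.0, -1
        for j, g in enumerate(gt_boxes):
            if j in used:
                continue  # each ground truth can be matched only once
            v = box_iou(p, g)
            if v > best_iou:
                best_iou, best_j = v, j
        if best_j >= 0 and best_iou >= iou_thresh:
            used.add(best_j)
            tp += 1
    fp = len(pred_boxes) - tp
    fn = len(gt_boxes) - tp
    return tp, fp, fn
```

With 8 predictions and 10 ground truths, this would give at most 8 true positives and at least 2 false negatives, so precision is tp / 8 and recall is tp / 10 for that image. What I still don't know is how to get from these counts to average precision.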