I am trying to evaluate average precision (AP) for a face detection algorithm. I have a CSV file with all the predicted bounding boxes and another CSV, in a similar format, with all the ground-truth bounding boxes. I found plenty of simple code online for computing bounding-box overlap and AP, but none of it handles a number of predictions that differs from the number of ground-truth boxes. For example, if an image has 8 predicted boxes but 10 ground-truth boxes, how do I calculate the average precision?
I tried to mimic one of those implementations and created a CSV file in the format below:
    # Filename-1.jpg
    [ x1, y1, x2, y2 ]
    [ x1, y1, x2, y2 ]
    [ x1, y1, x2, y2 ]
    [ x1, y1, x2, y2 ]
    # Filename-2.jpg
    [ x1, y1, x2, y2 ]
    [ x1, y1, x2, y2 ]
    [ x1, y1, x2, y2 ]
    # Filename-3.jpg
    [ x1, y1, x2, y2 ]
    [ x1, y1, x2, y2 ]
    ...
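For reference, a file in this format could be read back into a per-image dictionary roughly as follows. This is only a sketch: the name `parse_boxes` is hypothetical, and it assumes that filenames contain no whitespace and that every box holds exactly four comma-separated numbers.

```python
import re

# A token is either a "# <filename>" header or a "[ x1, y1, x2, y2 ]" box.
TOKEN = re.compile(r"#\s*(\S+)|\[([^\]]*)\]")


def parse_boxes(text):
    """Map each filename to a list of [x1, y1, x2, y2] float lists."""
    boxes = {}
    current = None  # filename of the header seen most recently
    for match in TOKEN.finditer(text):
        filename, coords = match.group(1), match.group(2)
        if filename is not None:
            current = filename
            boxes.setdefault(current, [])
        elif current is not None:
            values = [float(v) for v in coords.split(",")]
            if len(values) != 4:
                raise ValueError(f"expected 4 coordinates, got {values}")
            boxes[current].append(values)
    return boxes
```

Because each image keeps its own list, the prediction file and the ground-truth file can hold a different number of boxes for the same filename, and an image with no boxes simply maps to an empty list.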
I am also sharing what I tried so far:
    import numpy as np


    def get_max_iou(pred_boxes, gt_box):
        """
        Calculate the IoU between multiple pred_boxes and one gt_box.
        pred_boxes: predicted box coordinates, shape (N, 4) as [x1, y1, x2, y2]
        gt_box: one ground-truth box, shape (4,) as [x1, y1, x2, y2]
        return: IoU of each predicted box with gt_box, the max IoU, and its index
        """
        pred_boxes = np.asarray(pred_boxes, dtype=float).reshape(-1, 4)
        gt_box = np.asarray(gt_box, dtype=float).reshape(4)

        # No predictions: nothing can overlap the ground-truth box.
        # Returning index -1 here is a convention chosen for this fix.
        if pred_boxes.shape[0] == 0:
            return np.zeros(0), 0.0, -1

        # 1. calculate the intersection coordinates
        ixmin = np.maximum(pred_boxes[:, 0], gt_box[0])
        ixmax = np.minimum(pred_boxes[:, 2], gt_box[2])
        iymin = np.maximum(pred_boxes[:, 1], gt_box[1])
        iymax = np.minimum(pred_boxes[:, 3], gt_box[3])

        # The +1 treats coordinates as inclusive pixel indices
        iw = np.maximum(ixmax - ixmin + 1., 0.)
        ih = np.maximum(iymax - iymin + 1., 0.)

        # 2. calculate the area of the intersection
        inters = iw * ih

        # 3. calculate the area of the union
        uni = ((pred_boxes[:, 2] - pred_boxes[:, 0] + 1.) * (pred_boxes[:, 3] - pred_boxes[:, 1] + 1.) +
               (gt_box[2] - gt_box[0] + 1.) * (gt_box[3] - gt_box[1] + 1.) -
               inters)

        # 4. calculate the overlaps, the max overlap, and the index of the best pred_box
        iou = inters / uni
        iou_max = np.max(iou)
        nmax = np.argmax(iou)
        return iou, iou_max, nmax
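The counts of predictions and ground truths do not need to agree. One common approach (in the style of the PASCAL VOC evaluation) is to match boxes greedily: every prediction left without a match counts as a false positive, and every ground truth left without a match lowers recall. Below is a minimal sketch of that idea, not a definitive implementation. It makes several assumptions: each prediction has a confidence score (the file format above has none, so `scores` is a hypothetical extra input), the predictions within an image are already sorted by descending score, and the IoU threshold is 0.5. The names `iou_one_to_many`, `match_image`, and `average_precision` are illustrative.

```python
import numpy as np


def iou_one_to_many(box, boxes):
    """IoU between one box and an (N, 4) array of [x1, y1, x2, y2] boxes."""
    ixmin = np.maximum(boxes[:, 0], box[0])
    iymin = np.maximum(boxes[:, 1], box[1])
    ixmax = np.minimum(boxes[:, 2], box[2])
    iymax = np.minimum(boxes[:, 3], box[3])
    # The +1 treats coordinates as inclusive pixel indices.
    inter = np.maximum(ixmax - ixmin + 1.0, 0.0) * np.maximum(iymax - iymin + 1.0, 0.0)
    area_box = (box[2] - box[0] + 1.0) * (box[3] - box[1] + 1.0)
    area_boxes = (boxes[:, 2] - boxes[:, 0] + 1.0) * (boxes[:, 3] - boxes[:, 1] + 1.0)
    return inter / (area_box + area_boxes - inter)


def match_image(pred_boxes, gt_boxes, iou_thresh=0.5):
    """Flag each prediction of one image as a true positive (True) or false positive.

    Assumes pred_boxes is sorted by descending confidence. Each ground truth
    can be claimed by at most one prediction; later duplicates become FPs.
    """
    pred_boxes = np.asarray(pred_boxes, dtype=float).reshape(-1, 4)
    gt_boxes = np.asarray(gt_boxes, dtype=float).reshape(-1, 4)
    is_tp = np.zeros(len(pred_boxes), dtype=bool)
    gt_used = np.zeros(len(gt_boxes), dtype=bool)
    if len(gt_boxes) == 0:
        return is_tp  # every prediction is a false positive
    for i, pred in enumerate(pred_boxes):
        ious = iou_one_to_many(pred, gt_boxes)
        j = int(np.argmax(ious))
        if ious[j] >= iou_thresh and not gt_used[j]:
            is_tp[i] = True
            gt_used[j] = True
    return is_tp


def average_precision(scores, is_tp, num_gt):
    """Area under the precision-recall curve (all-point interpolation)."""
    order = np.argsort(-np.asarray(scores, dtype=float))
    tp = np.asarray(is_tp, dtype=bool)[order]
    cum_tp = np.cumsum(tp)
    cum_fp = np.cumsum(~tp)
    recall = cum_tp / max(num_gt, 1)
    precision = cum_tp / np.maximum(cum_tp + cum_fp, 1)
    # Add sentinel points, then make precision non-increasing in recall.
    mrec = np.concatenate(([0.0], recall, [1.0]))
    mpre = np.concatenate(([0.0], precision, [0.0]))
    mpre = np.maximum.accumulate(mpre[::-1])[::-1]
    # Sum rectangle areas wherever recall changes.
    idx = np.where(mrec[1:] != mrec[:-1])[0]
    return float(np.sum((mrec[idx + 1] - mrec[idx]) * mpre[idx + 1]))
```

To score a whole dataset under these assumptions, `match_image` would be run once per image, the resulting flags and scores concatenated across all images, and `num_gt` set to the total number of ground-truth boxes before calling `average_precision`. With 8 predictions and 10 ground truths in an image, at most 8 ground truths can be matched, so recall for that image cannot exceed 0.8.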