First time posting so here goes.
During a test loop on the COCO dataset, I ran into an interesting problem: how do I verify that the ground-truth labels in an image match the model's scores? The model returns a dictionary where predictions['scores'] is sorted from highest to lowest. The problem comes when computing an accuracy score: I need to check that the labels in predictions['labels'] (sorted along with the scores) actually match the ground truth.

A quick example: if the ground-truth image contains person, dog, cat, but the model predicts horse, sheep, person, the accuracy might be 0.33 or 0.0. It's 0.0 if we do a positional comparison like predictions['labels'] == ground_truth_labels, because person (class 1) isn't at the same index. I worked around this using IoU, comparing all the bounding boxes so that a prediction only counts as correct when both the label and the box match (a True-True result). Is this the right approach?
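For reference, here is a minimal sketch of the kind of IoU-based matching I mean. The function names (`iou`, `match_accuracy`) and the 0.5 IoU threshold are my own choices, not anything from torchvision or the COCO API, and boxes are assumed to be in `[x1, y1, x2, y2]` format. Each ground-truth box can be matched at most once, and a prediction counts as correct only when both the box overlaps and the label agrees:

```python
import numpy as np

def iou(box_a, box_b):
    """IoU between two boxes in [x1, y1, x2, y2] format."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def match_accuracy(pred_boxes, pred_labels, gt_boxes, gt_labels,
                   iou_thresh=0.5):
    """Fraction of ground-truth objects matched by a prediction with
    IoU >= iou_thresh AND the same label (the True-True condition).
    Predictions should be passed in descending score order."""
    matched = set()  # indices of ground-truth boxes already claimed
    correct = 0
    for p_box, p_label in zip(pred_boxes, pred_labels):
        for i, (g_box, g_label) in enumerate(zip(gt_boxes, gt_labels)):
            if i in matched:
                continue
            if iou(p_box, g_box) >= iou_thresh and p_label == g_label:
                matched.add(i)
                correct += 1
                break
    return correct / max(len(gt_labels), 1)
```

So in the person/dog/cat example above, only the person prediction would pass both checks, giving an accuracy of 1/3 rather than 0.0 from a positional comparison.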