I (also) don't understand the output of Mask R-CNN evaluation

Hello
I just switched from TensorFlow to PyTorch, so I know my neural networks, but not so much PyTorch.

When I call evaluate() after an epoch of training with Mask R-CNN (using ResNet-50), I get this kind of output:

IoU metric: bbox
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.316
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.429
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.377
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.178
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.342
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.202
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.595
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.619
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.496
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.640
IoU metric: segm
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.332
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.430
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.388
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.130
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.363
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.210
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.617
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.640
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.494
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.664

Precision and recall values should converge to 1.0, and “bbox” and “segm” are probably the results for bounding boxes and masks, OK.
I know what precision, recall and IoU mean, but the rest is very unclear to me.

But what are “(AR)”, “IoU=0.50:0.95” and “maxDets”?
And what does “-1” indicate for precision and recall?

Can someone explain these values to me?
I tried to find some documentation about this but failed (maybe someone should document that!?).

Regards,
Bernd

It’s all explained on the COCO (Common Objects in Context) website.
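
For readers who want the labels in the printout spelled out, here is a minimal Python sketch of the COCO conventions as I understand them. AP is average precision and AR is average recall. “IoU=0.50:0.95” means the value is averaged over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05. “small”, “medium” and “large” are object-area buckets split at 32² and 96² pixels. A value of -1 means there was nothing to average, for example because the dataset has no ground-truth objects in that size bucket. The helper names (box_iou, area_bucket, summarize) are made up for illustration and are not part of pycocotools, and the boundary handling is simplified.

```python
# "IoU=0.50:0.95": the metric is averaged over these ten IoU thresholds.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2). Illustrative helper."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def area_bucket(area):
    """Map an object area in pixels^2 to a COCO size bucket (simplified)."""
    if area < 32 ** 2:
        return "small"
    if area < 96 ** 2:
        return "medium"
    return "large"


def summarize(values):
    """Mean of the valid entries; -1.0 if there is nothing to average."""
    valid = [v for v in values if v > -1]
    return sum(valid) / len(valid) if valid else -1.0


# A detection shifted by half its width overlaps the ground truth with
# IoU = 1/3, so it is not a match at any of the ten thresholds.
iou = box_iou((0, 0, 10, 10), (5, 0, 15, 10))
matched_at = [t for t in IOU_THRESHOLDS if iou >= t]

# No small objects in the dataset: the "small" rows come out as -1.
small_ap = summarize([])
```

So the -1.000 rows for area=small most likely just say that the dataset contains no objects smaller than 32×32 pixels; they are not a sign of a broken model.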


Thank you very much! This is really great!

@vriez what does 100 detections per image mean? I don't have any image that requires that many detections.
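
As far as I understand the COCO evaluation, maxDets is an upper limit and not a requirement: only the highest-scoring detections, up to that number, are taken into account when precision and recall are computed. If the model returns fewer, all of them are used. A rough sketch of the idea, where keep_top_detections and the dictionary layout are hypothetical and not the pycocotools API:

```python
def keep_top_detections(detections, max_dets=100):
    """Keep at most max_dets detections, highest score first.

    `detections` is a list of dicts with a "score" key (assumed layout).
    """
    ranked = sorted(detections, key=lambda d: d["score"], reverse=True)
    return ranked[:max_dets]


detections = [
    {"label": "cat", "score": 0.42},
    {"label": "dog", "score": 0.91},
    {"label": "cat", "score": 0.77},
]

top1 = keep_top_detections(detections, max_dets=1)      # like maxDets=1
top100 = keep_top_detections(detections, max_dets=100)  # like maxDets=100
```

With only three detections, the maxDets=100 case keeps all three. That would also explain why AR rises from maxDets=1 to maxDets=10 in the output above and then barely changes at maxDets=100.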