I'm trying to train torchvision's pre-trained Mask R-CNN model on a custom dataset prepared in COCO format, using the train_one_epoch and evaluate functions from torchvision's detection reference scripts (vision/references/detection/engine.py) for training and evaluation, respectively.
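For context, my setup follows the standard torchvision detection tutorial pattern. The sketch below is simplified: MyCocoDataset, num_epochs, and the head replacement are placeholders standing in for my actual code.

import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn

# engine.py and utils.py are copied from torchvision's references/detection
from engine import train_one_epoch, evaluate
import utils

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Pre-trained Mask R-CNN; replacing the box/mask predictor heads for my
# custom classes is omitted here for brevity
model = maskrcnn_resnet50_fpn(pretrained=True)
model.to(device)

# MyCocoDataset (placeholder) yields (image, target) pairs where target
# holds "boxes" (N, 4), "labels" (N,) and "masks" (N, H, W) tensors
data_loader = torch.utils.data.DataLoader(
    MyCocoDataset("train"), batch_size=2, shuffle=True,
    collate_fn=utils.collate_fn)
data_loader_test = torch.utils.data.DataLoader(
    MyCocoDataset("val"), batch_size=1, shuffle=False,
    collate_fn=utils.collate_fn)

params = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(params, lr=0.005, momentum=0.9,
                            weight_decay=0.0005)

for epoch in range(num_epochs):
    train_one_epoch(model, optimizer, data_loader, device, epoch, print_freq=10)
    evaluate(model, data_loader_test, device=device)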
The loss_mask metric is decreasing, as can be seen here:
Epoch: [5] [ 0/20] eta: 0:00:54 lr: 0.005000 loss: 0.5001 (0.5001) loss_classifier: 0.2200 (0.2200) loss_box_reg: 0.2616 (0.2616) loss_mask: 0.0014 (0.0014) loss_objectness: 0.0051 (0.0051) loss_rpn_box_reg: 0.0120 (0.0120) time: 2.7308 data: 1.2866 max mem: 9887
Epoch: [5] [10/20] eta: 0:00:26 lr: 0.005000 loss: 0.4734 (0.4982) loss_classifier: 0.2055 (0.2208) loss_box_reg: 0.2515 (0.2595) loss_mask: 0.0012 (0.0013) loss_objectness: 0.0038 (0.0054) loss_rpn_box_reg: 0.0094 (0.0113) time: 2.6218 data: 1.1780 max mem: 9887
Epoch: [5] [19/20] eta: 0:00:02 lr: 0.005000 loss: 0.5162 (0.5406) loss_classifier: 0.2200 (0.2384) loss_box_reg: 0.2616 (0.2820) loss_mask: 0.0014 (0.0013) loss_objectness: 0.0051 (0.0062) loss_rpn_box_reg: 0.0120 (0.0127) time: 2.6099 data: 1.1755 max mem: 9887
But the evaluate output shows absolutely no improvement from zero for the segm IoU metrics:
IoU metric: bbox
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.653
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.843
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.723
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.788
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.325
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.701
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.738
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.739
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.832
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.456
IoU metric: segm
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
The segm metrics don't improve even after training for 500 epochs. And when I visualize the masks the model outputs after 100 or 500 epochs of training, they show just a couple of dots here and there.
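In case it helps, this is roughly how I extract and visualize a predicted mask (simplified; img is a single test image tensor, and the 0.5 threshold follows the tutorial):

import numpy as np
from PIL import Image

model.eval()
with torch.no_grad():
    # the model takes a list of (C, H, W) tensors and returns one dict
    # per image with "boxes", "labels", "scores" and "masks"
    prediction = model([img.to(device)])[0]

# in eval mode the masks are soft, with shape (N, 1, H, W) and values
# in [0, 1]; binarize the first (highest-scoring) one at 0.5
mask = (prediction["masks"][0, 0] > 0.5).cpu().numpy().astype(np.uint8) * 255
Image.fromarray(mask).save("predicted_mask.png")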