Mask RCNN number of predictions

ATAboukhadra · July 19, 2022, 10:36am

Hi,

I’m using the Keypoint RCNN from torchvision to estimate bounding boxes and keypoints for objects in RGB images. In my dataset, every image is guaranteed to contain exactly one object. I therefore don’t need the RCNN to produce all possible relevant RoIs; I only need one.

To do so, I set rpn_post_nms_top_n_train to 1. This mostly works, but sometimes the RPN produces 2 RoIs per image during training even though I specified that I only need 1. How do I fix that? What is this second RoI?

My second question is related to the same problem. With more training, the RCNN appears to overfit: it stops making predictions, seemingly to avoid incurring more loss, so the number of RoIs drops to 0 for some test images.
Every sample in the test set is guaranteed to contain an object. How do I penalize the RCNN when it produces no output, or how can I guarantee that it gives exactly 1 prediction per image (no more, no less)?

Another related question:
Which parts of the RCNN can be frozen to keep the number of predictions per image stable while fine-tuning the rest?

Thanks a lot.