Mask RCNN number of predictions


I’m using the Keypoint RCNN from torchvision to estimate bounding boxes and keypoints for objects in RGB images. In my dataset, it’s guaranteed that there is exactly 1 object in each image. Therefore, I don’t need the RCNN to produce all possible relevant RoIs; I only need 1.

To do so, I set rpn_post_nms_top_n_train to 1. This mostly works, but sometimes the RPN produces 2 RoIs per image during training even though I specified that I only need 1. How do I fix that? What is this second RoI?

My second question is related to the same problem. With more training, the RCNN appears to overfit: it stops producing predictions, seemingly to avoid incurring more loss, so the number of predicted RoIs drops to 0 for some test images.
It’s guaranteed that every sample in the test set contains an object. How do I penalize the RCNN when it doesn’t produce any output, or how can I guarantee that it gives exactly 1 prediction per image (no more, no less)?

Another related question:
Which parts of the RCNN can be frozen to keep the number of predictions per image stable while fine-tuning the rest of it?

Thanks a lot.