I am currently training a Mask R-CNN model, and all of the output
masks look really bad compared to the label scores.
In my training set, most of the masks have one annotation nested inside another. For example:
A = torch.zeros((5, 5)).to(int)
A[1:4, 1:4] = 1
B = torch.zeros((5, 5)).to(int)
B[2, 2] = 1
# Is the correct mask for A actually A ^ B?
Masks A and B would share the center pixel.
Is this incorrect, or could something else be wrong?
This is perfectly reasonable for Mask R-CNN (although it may indicate
that you are working on a problem that is inherently more difficult to train).
Mask R-CNN performs instance segmentation. It is conceptually fine
to have instance-1 of class-A be contained in instance-2 also of class-A.
You could also have instance-1 of class-A be enclosed in instance-1 of class-B.
Mask R-CNN can be applied to such problems (assuming enough training
data of good-enough quality, and so on), although if you told me that such
a use case would tend to be harder to train, I would believe you.
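For what it's worth, here is a minimal sketch of how the two nested masks from the question could be packed into a training target without cutting B out of A. I am assuming the torchvision Mask R-CNN target convention (a dict with "boxes", "labels" and "masks"); the class ids and the mask_to_box helper are made up for illustration.

```python
import torch

# The two masks from the question: B sits inside A.
A = torch.zeros((5, 5), dtype=torch.uint8)
A[1:4, 1:4] = 1
B = torch.zeros((5, 5), dtype=torch.uint8)
B[2, 2] = 1


def mask_to_box(mask):
    # Hypothetical helper: tight [x1, y1, x2, y2] box around the nonzero pixels.
    ys, xs = torch.nonzero(mask, as_tuple=True)
    return torch.stack([xs.min(), ys.min(), xs.max() + 1, ys.max() + 1]).float()


# Each instance keeps its full mask; the shared pixel is not removed from A.
masks = torch.stack([A, B])                            # [N, H, W] = [2, 5, 5]
boxes = torch.stack([mask_to_box(m) for m in masks])   # [N, 4]
labels = torch.tensor([1, 2])                          # [N]; class ids are assumed

target = {"boxes": boxes, "labels": labels, "masks": masks}
overlap = int((A & B).sum())                           # pixels shared by both instances
```

Here the center pixel belongs to both instances, so overlap is 1 and A is left as the full 3x3 square.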
I found the error in my code. I was feeding the model a mask tensor of shape [N, 1, X, Y], which is the shape of the model's output, rather than [N, X, Y], which is the shape required for the target masks.
It seems peculiar that these models often accept oddly shaped inputs without complaining. I had a similar error a long time ago when I fed labels in as a [1, N] tensor and not as the required [N] tensor.
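In case it helps someone else, this is roughly how both shape mistakes can be fixed and guarded against. It is only a sketch: the tensor names and sizes are invented, and the shape checks are my own suggestion rather than anything the model does for you.

```python
import torch

N, X, Y = 3, 5, 5

# Masks shaped like the model output: [N, 1, X, Y].
pred_like_masks = torch.zeros((N, 1, X, Y), dtype=torch.uint8)
# Labels with an extra leading dimension: [1, N].
row_labels = torch.tensor([[1, 2, 1]])

# Drop only the known singleton dimension, so a batch with N == 1 stays intact.
target_masks = pred_like_masks.squeeze(1)   # [N, X, Y]
target_labels = row_labels.squeeze(0)       # [N]

# Cheap guards before handing the targets to the model.
assert target_masks.dim() == 3, target_masks.shape
assert target_labels.dim() == 1, target_labels.shape
assert target_masks.shape[0] == target_labels.shape[0]
```

Passing the dimension to squeeze explicitly is deliberate: a bare squeeze() would also collapse N when there is only one instance.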