Mask R-CNN cannot detect ROI class probabilities; the predicted class is always background

Hello,
I’m trying to run the Mask R-CNN inference demo with a pre-trained model on an image from the COCO dataset.
I’m following this repository: GitHub - multimodallearning/pytorch-mask-rcnn

I replaced the C++ extensions embedded in Python for NMS and RoIAlign
with torchvision.ops.nms() and torchvision.ops.roi_align().

My configuration is: Python 3.8, torch 1.9.1+cu111, torchvision 0.10.1+cu111

In the “FPN Head classifier”, the predicted mrcnn class probability for every ROI is always background.
As a result, the next step, which refines the ROIs
(takes the class probability of the top class of each ROI and filters out background boxes), fails
because all class probabilities are background.

The error output is:
D:\Ru\pytorch-mask-rcnn\test\model_1028.py:1687: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
  molded_images = Variable(molded_images, volatile=True)
C:\Users\admin\anaconda3\envs\torch\lib\site-packages\torch\nn\functional.py:3487: UserWarning: nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.
  warnings.warn("nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.")
Traceback (most recent call last):
  File "D:/Ru/pytorch-mask-rcnn/test/demo_1028.py", line 99, in
    results = model.detect([image])
  File "D:\Ru\pytorch-mask-rcnn\test\model_1028.py", line 1690, in detect
    detections, mrcnn_mask = self.predict([molded_images, image_metas], mode='inference')
  File "D:\Ru\pytorch-mask-rcnn\test\model_1028.py", line 1766, in predict
    detections = detection_layer(self.config, rpn_rois, mrcnn_class, mrcnn_bbox, image_metas)
  File "D:\Ru\pytorch-mask-rcnn\test\model_1028.py", line 919, in detection_layer
    detections = refine_detections(rois, mrcnn_class, mrcnn_bbox, window, config)
  File "D:\Ru\pytorch-mask-rcnn\test\model_1028.py", line 890, in refine_detections
    keep = intersect1d(keep, nms_keep)
UnboundLocalError: local variable 'nms_keep' referenced before assignment

What do you think? What’s happening?

This error says that nms_keep is used before its creation/assignment, so check your code and make sure nms_keep is properly initialized before it is used. This often happens when a variable is initialized inside an if condition that is never triggered.
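
To illustrate the failure mode in isolation, here is a small sketch in plain Python. The function names collect and collect_safe are made up; they only mimic the pattern of binding a variable inside a loop and reading it afterwards.

```python
def collect(groups):
    """Mimics the pattern in question: `result` is only bound inside the loop."""
    for i, group in enumerate(groups):
        if i == 0:
            result = list(group)
        else:
            result = result + list(group)
    return result  # unbound if `groups` was empty


def collect_safe(groups):
    """Same logic, but `result` is bound before the loop starts."""
    result = []
    for group in groups:
        result = result + list(group)
    return result


print(collect([[1, 2], [3]]))      # works: the loop body ran at least once
try:
    collect([])                    # the loop body never runs
except UnboundLocalError as err:
    print("UnboundLocalError:", err)
print(collect_safe([]))            # returns an empty list instead of failing
```

Binding the variable to an empty value before the loop removes the need for the i == 0 special case altogether.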

Thanks for the reply.
nms_keep is created after the if condition:

def refine_detections(rois, probs, deltas, window, config):
    """Refine classified proposals and filter overlaps and return final
    detections.

    Inputs:
        rois: [N, (y1, x1, y2, x2)] in normalized coordinates
        probs: [N, num_classes]. Class probabilities.
        deltas: [N, num_classes, (dy, dx, log(dh), log(dw))]. Class-specific
                bounding box deltas.
        window: (y1, x1, y2, x2) in image coordinates. The part of the image
            that contains the image excluding the padding.

    Returns detections shaped: [N, (y1, x1, y2, x2, class_id, score)]
    """

    # Class IDs per ROI
    _, class_ids = torch.max(probs, dim=1)

    # Class probability of the top class of each ROI
    # Class-specific bounding box deltas
    idx = torch.arange(class_ids.size()[0]).long()
    if config.GPU_COUNT:
        idx = idx.cuda()
    class_scores = probs[idx, class_ids.data]
    deltas_specific = deltas[idx, class_ids.data]

    # Apply bounding box deltas
    # Shape: [boxes, (y1, x1, y2, x2)] in normalized coordinates
    std_dev = Variable(torch.from_numpy(np.reshape(config.RPN_BBOX_STD_DEV, [1, 4])).float(), requires_grad=False)
    if config.GPU_COUNT:
        std_dev = std_dev.cuda()
    refined_rois = apply_box_deltas(rois, deltas_specific * std_dev)

    # Convert coordinates to image domain
    height, width = config.IMAGE_SHAPE[:2]
    scale = Variable(torch.from_numpy(np.array([height, width, height, width])).float(), requires_grad=False)
    if config.GPU_COUNT:
        scale = scale.cuda()
    refined_rois *= scale

    # Clip boxes to image window
    refined_rois = clip_to_window(window, refined_rois)

    # Round and cast to int since we're dealing with pixels now
    refined_rois = torch.round(refined_rois)

    # TODO: Filter out boxes with zero area

    # Filter out background boxes
    keep_bool = class_ids>0

    # Filter out low confidence boxes
    if config.DETECTION_MIN_CONFIDENCE:
        keep_bool = keep_bool & (class_scores >= config.DETECTION_MIN_CONFIDENCE)
    keep = torch.nonzero(keep_bool)[:,0]

    # Apply per-class NMS
    pre_nms_class_ids = class_ids[keep.data]
    pre_nms_scores = class_scores[keep.data]
    pre_nms_rois = refined_rois[keep.data]

    for i, class_id in enumerate(unique1d(pre_nms_class_ids)):
        # Pick detections of this class
        ixs = torch.nonzero(pre_nms_class_ids == class_id)[:,0]

        # Sort
        ix_rois = pre_nms_rois[ixs.data]
        ix_scores = pre_nms_scores[ixs]
        ix_scores, order = ix_scores.sort(descending=True)
        ix_rois = ix_rois[order.data,:]

        # class_keep = nms(torch.cat((ix_rois, ix_scores.unsqueeze(1)), dim=1).data, config.DETECTION_NMS_THRESHOLD)
        class_keep = torchvision.ops.nms(ix_rois, ix_scores, config.DETECTION_NMS_THRESHOLD)

        # Map indices
        class_keep = keep[ixs[order[class_keep].data].data]

        if i==0:
            nms_keep = class_keep
        else:
            nms_keep = unique1d(torch.cat((nms_keep, class_keep)))
    keep = intersect1d(keep, nms_keep)

This is my code.

In the “Filter out background boxes” part:

    keep_bool = class_ids>0

all of the class_ids are zero.
That seems strange to me.
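
To see how the classifier head behaves before any filtering, it can help to summarize the class probabilities directly. A minimal sketch: the function name summarize_class_probs is hypothetical, and probs stands for the [N, num_classes] tensor passed to refine_detections; the numbers below are stand-in data, not output from the real model.

```python
import torch

def summarize_class_probs(probs):
    """Summarize an [N, num_classes] tensor of per-ROI class probabilities."""
    scores, class_ids = torch.max(probs, dim=1)
    counts = torch.bincount(class_ids, minlength=probs.size(1))
    return {
        "rois_per_class": counts.tolist(),
        "num_foreground": int((class_ids > 0).sum()),
        "mean_background_prob": float(probs[:, 0].mean()),
        # True when every ROI gets (almost) exactly the same distribution.
        "rows_identical": bool(torch.allclose(probs, probs[:1].expand_as(probs))),
    }

# Stand-in data: three ROIs, three classes, background (column 0) always wins.
probs = torch.softmax(torch.tensor([[5.0, 0.1, 0.2],
                                    [4.0, 0.3, 0.1],
                                    [6.0, 0.2, 0.2]]), dim=1)
summary = summarize_class_probs(probs)
print(summary)
```

If every row turned out to be identical, that might suggest the head receives the same pooled features for every ROI, which would point at the ROI pooling step rather than at the classifier weights. This is a guess, not something confirmed in this thread.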

############################################################
#  Feature Pyramid Network Heads
############################################################

class Classifier(nn.Module):
    def __init__(self, depth, pool_size, image_shape, num_classes):
        super(Classifier, self).__init__()
        self.depth = depth
        self.pool_size = pool_size
        self.image_shape = image_shape
        self.num_classes = num_classes
        self.conv1 = nn.Conv2d(self.depth, 1024, kernel_size=self.pool_size, stride=1)
        self.bn1 = nn.BatchNorm2d(1024, eps=0.001, momentum=0.01)
        self.conv2 = nn.Conv2d(1024, 1024, kernel_size=1, stride=1)
        # self.conv2 = nn.Conv2d(self.depth, 1024, kernel_size=1, stride=1)
        self.bn2 = nn.BatchNorm2d(1024, eps=0.001, momentum=0.01)
        self.relu = nn.ReLU(inplace=True)
        self.linear_class = nn.Linear(1024, num_classes)
        self.softmax = nn.Softmax(dim=1)
        self.linear_bbox = nn.Linear(1024, num_classes * 4)

    def forward(self, x, rois):
        x = pyramid_roi_align([rois]+x, self.pool_size, self.image_shape)
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.conv2(x)
        x = self.bn2(x)
        x = self.relu(x)

        x = x.view(-1, 1024)
        mrcnn_class_logits = self.linear_class(x)
        mrcnn_probs = self.softmax(mrcnn_class_logits)

        mrcnn_bbox = self.linear_bbox(x)
        mrcnn_bbox = mrcnn_bbox.view(mrcnn_bbox.size()[0], -1, 4)

        return [mrcnn_class_logits, mrcnn_probs, mrcnn_bbox]

As previously described, this error is often caused by if conditions that are never met. That also seems to be the case here:

        if i==0:
            nms_keep = class_keep
        else:
            nms_keep = unique1d(torch.cat((nms_keep, class_keep)))

Add a print statement to the if i==0 case and you will most likely see that it is never executed before nms_keep is used, e.g. because the loop body does not run at all.

Hmm…
I think this issue comes from the architecture.
Do you understand how Mask R-CNN works?

############################################################
#  Feature Pyramid Network Heads
############################################################
        x = x.view(-1, 1024)
        mrcnn_class_logits = self.linear_class(x)
        mrcnn_probs = self.softmax(mrcnn_class_logits)

All of the mrcnn_probs predictions are background.

def refine_detections(rois, probs, deltas, window, config):
    # Class IDs per ROI
    _, class_ids = torch.max(probs, dim=1)

The background class_id is 0.
So when this is passed to the next step, filtering out the background boxes leaves nothing.
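
One thing worth ruling out, although nothing in this thread confirms it as the cause, is that the pre-trained weights did not actually load into the classifier head. A minimal sketch of how such a check could look: TinyHead is a made-up stand-in for the real model, and the checkpoint is simulated rather than read from a file.

```python
import torch
import torch.nn as nn

class TinyHead(nn.Module):
    """Hypothetical stand-in for the classifier head."""
    def __init__(self, num_classes=3):
        super().__init__()
        self.linear_class = nn.Linear(8, num_classes)
        self.linear_bbox = nn.Linear(8, num_classes * 4)

model = TinyHead()

# Simulate a checkpoint that lacks the bbox layer and carries one stray key.
checkpoint = {k: v for k, v in TinyHead().state_dict().items()
              if k.startswith("linear_class")}
checkpoint["old_layer.weight"] = torch.zeros(1)

# strict=False loads what matches and reports the rest instead of raising.
result = model.load_state_dict(checkpoint, strict=False)
print("missing:", result.missing_keys)
print("unexpected:", result.unexpected_keys)
```

Non-empty missing or unexpected key lists would mean that some layers are still at their random initialization, which could make the predictions meaningless.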

Thanks again for your support. I feel helpless.