Unable to get Masks with Mask R-CNN Model during inference

agn3ya · May 28, 2024, 6:59am

Hi Everyone,
I am working on a project that uses Mask R-CNN for video frame analysis. The model is trained on a custom dataset annotated in CVAT. It detects bounding boxes correctly, but I cannot get the masks to show up when I visualise them, and I am not sure whether the model is generating masks correctly.

Relevant Code:

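import random

import cv2
import numpy as np
from sklearn import preprocessing  # assumed: `preprocessing` is scikit-learn's module


def postprocess_detection(frames, outputs, threshold=0.5):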
    frames = frames.cpu().numpy().transpose(1, 2, 0)

    boxes = outputs['boxes'].cpu().detach().numpy()
    labels = outputs['labels'].cpu().detach().numpy()
    scores = outputs['scores'].cpu().detach().numpy()
    masks = outputs['masks'].cpu().detach().numpy()

    indices = scores >= threshold
    boxes = boxes[indices]
    labels = labels[indices]
    scores = scores[indices]
    masks = masks[indices]

    mask_canvas = np.zeros_like(frames, dtype=np.uint8)

    for i, mask in enumerate(masks):
        mask = mask[0, :, :]
        
        mask = preprocessing.normalize(mask)
        
        # Apply threshold to binarize mask
        mask = (mask > 0.5).astype(np.uint8)
        
        color = [random.randint(0, 255) for _ in range(3)]
        for c in range(3):
            mask_canvas[:, :, c] = np.where(mask == 1, color[c], mask_canvas[:, :, c])
    result_image = cv2.addWeighted(frames, 1, mask_canvas.astype(np.float32), 0.5, 0)

    return result_image, mask_canvas, boxes, labels, scores
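
One thing that may be worth checking in the function above: if `preprocessing` is scikit-learn's `sklearn.preprocessing`, then `normalize` rescales each row of the mask to unit L2 norm by default, so most values could end up below 0.5 and the binarized mask would come out empty. Assuming the model is torchvision's Mask R-CNN, the masks are already per-pixel probabilities in [0, 1] with shape [N, 1, H, W], so thresholding them directly may be enough. The blend also mixes a frame that is presumably in [0, 1] with a canvas in 0 to 255, which could distort the result. Below is a minimal NumPy-only sketch of the overlay under those assumptions; `overlay_masks`, `alpha` and `seed` are hypothetical names used only for illustration.

```python
import numpy as np


def overlay_masks(frame, masks, threshold=0.5, alpha=0.5, seed=0):
    """Blend thresholded instance masks onto a float RGB frame.

    Assumptions (not confirmed by the original post):
      frame: (H, W, 3) float array with values in [0, 1]
      masks: (N, 1, H, W) float array of probabilities in [0, 1]
    """
    rng = np.random.default_rng(seed)
    frame = frame.astype(np.float32)
    canvas = np.zeros_like(frame)                     # same scale as the frame
    covered = np.zeros(frame.shape[:2], dtype=bool)   # pixels hit by any mask

    for mask in masks:
        binary = mask[0] >= threshold   # threshold the raw probabilities
        canvas[binary] = rng.random(3)  # one random color in [0, 1) per instance
        covered |= binary

    # Blend only where a mask is present so the background stays unchanged.
    result = frame.copy()
    result[covered] = (1 - alpha) * frame[covered] + alpha * canvas[covered]
    return result, canvas
```

If the frame is later displayed or saved with OpenCV as an 8-bit image, the result would need to be scaled to 0 to 255 and cast to `uint8` first.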

Issue:

Bounding Boxes are drawn correctly
Masks are not visible, and I am not sure whether they are being generated correctly

Additional Details:

The masks are thresholded and resized before blending.
The output shape and type of the image array are checked and seem to be correct.
The device used for inference is CUDA.
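
To tell whether the masks are being generated at all, independent of the blending step, it may help to look at the raw values before any normalization or thresholding. The following is a small sketch, assuming `masks` is the NumPy array taken from `outputs['masks']` after score filtering, with shape (N, 1, H, W); `summarize_masks` is a hypothetical helper, not part of any library.

```python
import numpy as np


def summarize_masks(masks, threshold=0.5):
    """Return basic statistics for a stack of predicted masks."""
    masks = np.asarray(masks)
    if masks.size == 0:
        # No detections survived the score filter.
        return {"shape": masks.shape, "dtype": str(masks.dtype),
                "min": None, "max": None, "pixels_above": []}
    flat = masks.reshape(masks.shape[0], -1)  # one row per instance
    return {
        "shape": masks.shape,
        "dtype": str(masks.dtype),
        "min": float(masks.min()),
        "max": float(masks.max()),
        # Number of pixels at or above the threshold, per instance.
        "pixels_above": (flat >= threshold).sum(axis=1).tolist(),
    }
```

If `max` is well above the threshold and `pixels_above` is non-zero here, the model is probably producing usable masks and the problem more likely sits in the post-processing or blending.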