Extracting Bounding Box Coordinates from mask

Sabbir_Ahmed · November 16, 2019, 12:41pm

I have a model that predicts a binary mask. I want to extract the bounding box coordinates without moving the tensor to the CPU or converting it to NumPy. Is there an efficient way to do this in PyTorch?

spanev · November 16, 2019, 3:56pm

Hi @Sabbir_Ahmed,

What format do you want the coordinates to be extracted in?

You can access the raw data pointer with

t.data_ptr()

then you can cast/interpret it as a ctype with ctypes:

import ctypes
c_type_pointer = ctypes.c_void_p(t.data_ptr())
# Then you can process or copy from c_type_pointer

Sabbir_Ahmed · November 17, 2019, 4:39am

# I have a predicted mask
pred_mask = model(input)

# I just want an efficient function that maps the mask to bounding box coordinates without calling .cpu()
coordinates = function(pred_mask)

I am sorry, but I don't have a strong understanding of raw data pointers.
Could you please suggest a code snippet for such a function in PyTorch, @spanev?

Qing_En · April 30, 2020, 5:36am

You can refer to the code below:

import numpy as np

def extract_bboxes(mask):
    """Compute bounding boxes from masks.

    mask: [height, width, num_instances]. Mask pixels are either 1 or 0.

    Returns: bbox array [num_instances, (y1, x1, y2, x2)].
    """
    boxes = np.zeros([mask.shape[-1], 4], dtype=np.int32)
    for i in range(mask.shape[-1]):
        m = mask[:, :, i]
        # Bounding box: indices of the columns and rows that contain mask pixels.
        horizontal_indices = np.where(np.any(m, axis=0))[0]
        vertical_indices = np.where(np.any(m, axis=1))[0]
        if horizontal_indices.shape[0]:
            x1, x2 = horizontal_indices[[0, -1]]
            y1, y2 = vertical_indices[[0, -1]]
            # x2 and y2 should not be part of the box. Increment by 1.
            x2 += 1
            y2 += 1
        else:
            # No mask for this instance. Might happen due to
            # resizing or cropping. Set bbox to zeros.
            x1, x2, y1, y2 = 0, 0, 0, 0
        boxes[i] = np.array([y1, x1, y2, x2])
    return boxes.astype(np.int32)
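
Since the original question asked for a version that avoids .cpu() and NumPy, here is a minimal sketch of the same idea using only PyTorch ops. It assumes a single 2D mask of shape [height, width]; the name mask_to_box is hypothetical, and the output keeps the (y1, x1, y2, x2) convention with exclusive y2 and x2 used above. It avoids nonzero(), so the result stays on the mask's device.

```python
import torch

def mask_to_box(mask: torch.Tensor) -> torch.Tensor:
    """Return tensor([y1, x1, y2, x2]) for a 2D mask; y2 and x2 are exclusive.

    Hypothetical helper: an empty mask yields all zeros.
    """
    fg = mask != 0                      # treat any non-zero pixel as foreground
    h, w = fg.shape
    ys = torch.arange(h, device=fg.device)
    xs = torch.arange(w, device=fg.device)
    rows = fg.any(dim=1)                # True for rows containing foreground
    cols = fg.any(dim=0)                # True for columns containing foreground

    # Replace indices of empty rows/columns with sentinels, then reduce.
    y1 = torch.where(rows, ys, torch.full_like(ys, h)).min()
    y2 = torch.where(rows, ys, torch.full_like(ys, -1)).max() + 1
    x1 = torch.where(cols, xs, torch.full_like(xs, w)).min()
    x2 = torch.where(cols, xs, torch.full_like(xs, -1)).max() + 1

    box = torch.stack([y1, x1, y2, x2])
    # For an empty mask the sentinels leak through, so fall back to zeros.
    return torch.where(rows.any(), box, torch.zeros_like(box))
```

For a batch of masks, the same function can be applied per mask in a loop, or the reductions can be rewritten along the last two dimensions.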

Hope it will help you~ @Sabbir_Ahmed

Jaideep_Valani · June 1, 2020, 7:58am

@QingEn
Thanks. If I also want to capture some area around the bounding box, what should I do?
The intent is to crop the bounding box region of the image together with its surroundings.
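
One possible approach is to grow the box by a fixed margin and clip it to the image bounds. The sketch below assumes the (y1, x1, y2, x2) layout with exclusive y2 and x2 from extract_bboxes above; expand_box and margin are hypothetical names introduced only for illustration.

```python
def expand_box(box, margin, height, width):
    """Grow a (y1, x1, y2, x2) box by `margin` pixels on each side.

    The result is clipped so it never leaves a height x width image.
    """
    y1, x1, y2, x2 = box
    return (
        max(y1 - margin, 0),
        max(x1 - margin, 0),
        min(y2 + margin, height),
        min(x2 + margin, width),
    )

# The expanded box can then be used to crop the image, for example:
# y1, x1, y2, x2 = expand_box(box, margin=10, height=image.shape[0], width=image.shape[1])
# crop = image[y1:y2, x1:x2]
```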

Jean_Da_Rolt · July 14, 2021, 7:19pm

Using skimage, NumPy and pandas (pandas is not strictly necessary):

from typing import Dict, List, Optional

import numpy as np
import pandas as pd
from skimage.measure import label, regionprops
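import torch as th

# Assumed column names: the original post does not define this constant.
# The order follows the tuples built below: (file_id, *regionprops bbox, channel_idx),
# where the skimage bbox is (min_row, min_col, max_row, max_col).
COLUMN_NAMES_SEGMENTATION = ["file_id", "y1", "x1", "y2", "x2", "channel_id"]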

def simple_boxing(
    classmasks: th.Tensor,
    file_ids: Optional[List[int]] = None,
    channel2class: Optional[Dict[int, int]] = None,
) -> pd.DataFrame:
    """
    Convert a tensor of shape (B, C, H, W) to a pandas DataFrame with bounding boxes.

    Args:
        classmasks: Tensor with shape (batch_size, channels, height, width)
    """
    assert classmasks.dtype == th.bool, "classmasks data type must be boolean."
    assert (
        len(classmasks.shape) == 4
    ), f"classmasks with shape {classmasks.shape} should have 4 dimensions"

    # To save time, skip all (image, channel) pairs whose mask is empty.
    has_detections = (
        classmasks.flatten(start_dim=2)
        .any(dim=2)
        .nonzero()
        .cpu()
        .numpy()
    )

    boxes_list = []
    classmasks = classmasks.cpu().numpy()

    for img_idx, channel_idx in has_detections:
        labels = label(classmasks[img_idx, channel_idx], background=0, connectivity=2)
        props = regionprops(labels)
        file_id = file_ids[img_idx] if file_ids is not None else img_idx
        boxes_list.extend([(file_id, *x.bbox, channel_idx) for x in props])

    boxes = pd.DataFrame(boxes_list, columns=COLUMN_NAMES_SEGMENTATION)

    if channel2class is not None:
        boxes["class_id"] = boxes["channel_id"].map(channel2class)

    return boxes

Just in case someone needs it.

Peter_O_Connor · August 11, 2022, 3:08pm

The above answers address how to do it when you just want one box per mask. If you stumbled upon this question looking for how to predict multiple boxes per mask (e.g. when regions are disconnected) see:

Jinx_L · February 21, 2023, 3:24am

Suppose mask_np is the NumPy array of a binary mask. Then the following code will give you the bounding box coordinates:

import numpy as np

# the function
def bounding_box(img):
    rows = np.any(img, axis=1)  # rows that contain at least one mask pixel
    cols = np.any(img, axis=0)  # columns that contain at least one mask pixel
    rmin, rmax = np.where(rows)[0][[0, -1]]
    cmin, cmax = np.where(cols)[0][[0, -1]]
    return rmin, rmax, cmin, cmax  # y1, y2, x1, x2 (all inclusive)

# process the mask array with the above function
bounding_box(img=mask_np)
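
Note that bounding_box raises an IndexError when the mask contains no foreground pixels, and that its rmax and cmax are inclusive. As a small sketch of how that could be handled, the hypothetical variant below returns None for an empty mask; the name bounding_box_or_none and the toy mask are assumptions made for illustration.

```python
import numpy as np

def bounding_box_or_none(img):
    """Return (rmin, rmax, cmin, cmax), all inclusive, or None if img is empty."""
    rows = np.any(img, axis=1)
    cols = np.any(img, axis=0)
    if not rows.any():
        return None  # no foreground pixels, so there is no box to report
    rmin, rmax = np.where(rows)[0][[0, -1]]
    cmin, cmax = np.where(cols)[0][[0, -1]]
    return int(rmin), int(rmax), int(cmin), int(cmax)

# Toy example: a 2x3 block of foreground inside a 5x6 mask.
mask_np = np.zeros((5, 6), dtype=bool)
mask_np[1:3, 2:5] = True
print(bounding_box_or_none(mask_np))  # (1, 2, 2, 4)
```

Because the upper bounds are inclusive, cropping needs a +1 on each of them: mask_np[rmin:rmax + 1, cmin:cmax + 1].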