Extracting Bounding Box Coordinates from mask

I have a model that predicts a binary mask. I want to extract the bounding box coordinates without moving the mask to the CPU or converting it to NumPy. Is there an efficient way to do this in PyTorch?

Hi @Sabbir_Ahmed,

What format do you want the coordinates to be extracted in?

You can access the raw data pointer with

t.data_ptr()

and then cast or interpret it as a C type with ctypes:

import ctypes
c_type_pointer = ctypes.c_void_p(t.data_ptr())
# Then you can process or copy from c_type_pointer

# I have a predicted mask
pred_mask = model(input)

# I just want an efficient function that maps the mask to coordinates of bounding boxes without calling cpu 
coordinates = function(pred_mask)

And I am sorry, but I don't have a strong understanding of raw data pointers.
Could you please suggest a code snippet for such a function in PyTorch, @spanev?

You can refer to the code below:

import numpy as np

def extract_bboxes(mask):
    """Compute bounding boxes from masks.

    mask: [height, width, num_instances]. Mask pixels are either 1 or 0.

    Returns: bbox array [num_instances, (y1, x1, y2, x2)].
    """
    boxes = np.zeros([mask.shape[-1], 4], dtype=np.int32)
    for i in range(mask.shape[-1]):
        m = mask[:, :, i]
        # Bounding box: indices of the columns and rows that contain any mask pixel.
        horizontal_indices = np.where(np.any(m, axis=0))[0]
        vertical_indices = np.where(np.any(m, axis=1))[0]
        if horizontal_indices.shape[0]:
            x1, x2 = horizontal_indices[[0, -1]]
            y1, y2 = vertical_indices[[0, -1]]
            # x2 and y2 should not be part of the box. Increment by 1.
            x2 += 1
            y2 += 1
        else:
            # No mask for this instance. Might happen due to
            # resizing or cropping. Set bbox to zeros.
            x1, x2, y1, y2 = 0, 0, 0, 0
        boxes[i] = np.array([y1, x1, y2, x2])
    return boxes.astype(np.int32)

Hope this helps, @Sabbir_Ahmed.
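
Since the original question asked to avoid the CPU and NumPy, here is a minimal sketch of the same idea using only PyTorch operations. It is not code from this thread: the function name masks_to_boxes_torch and the [num_instances, height, width] layout are assumptions made for illustration (the NumPy version uses [height, width, num_instances]). As in the NumPy version, y2 and x2 are exclusive and empty masks give an all-zero box.

```python
import torch

def masks_to_boxes_torch(masks: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: boxes (y1, x1, y2, x2) for masks of shape [N, H, W].

    Runs on whatever device `masks` lives on; no .cpu() or NumPy calls.
    """
    masks = masks.bool()
    n, h, w = masks.shape
    rows = masks.any(dim=2)  # [N, H]: rows that contain foreground
    cols = masks.any(dim=1)  # [N, W]: columns that contain foreground
    ys = torch.arange(h, device=masks.device)
    xs = torch.arange(w, device=masks.device)

    # Smallest foreground index: background positions get a large sentinel.
    y1 = torch.where(rows, ys, torch.full_like(ys, h)).min(dim=1).values
    x1 = torch.where(cols, xs, torch.full_like(xs, w)).min(dim=1).values
    # Largest foreground index plus one (exclusive end); background counts as 0.
    y2 = torch.where(rows, ys + 1, torch.zeros_like(ys)).max(dim=1).values
    x2 = torch.where(cols, xs + 1, torch.zeros_like(xs)).max(dim=1).values

    boxes = torch.stack([y1, x1, y2, x2], dim=1)  # [N, 4], int64
    # Zero out the boxes of empty masks.
    nonempty = rows.any(dim=1).unsqueeze(1).to(boxes.dtype)
    return boxes * nonempty

example = torch.zeros(2, 5, 6)
example[0, 1:3, 2:5] = 1  # rows 1..2, columns 2..4; second mask stays empty
example_boxes = masks_to_boxes_torch(example)
```

For a single (H, W) mask, pass mask.unsqueeze(0). For a model output of shape (B, 1, H, W), pred_mask.squeeze(1) gives the expected layout, assuming the output has already been thresholded to 0/1.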


Thanks. If I also want to capture some area around the bounding box, what should I do?
The intent is to capture the image portion inside the bounding box plus its vicinity.
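
One way to do this, as a rough sketch rather than an answer from the thread, is to grow the box by a fixed margin on every side, clamp it to the image borders, and slice. The function name crop_with_margin and the margin parameter are hypothetical, and the box is assumed to be (y1, x1, y2, x2) with exclusive y2 and x2, as in the NumPy code earlier in this thread.

```python
import torch

def crop_with_margin(image: torch.Tensor, box, margin: int) -> torch.Tensor:
    """Hypothetical helper: crop `image` ([..., H, W]) to `box` grown by `margin` pixels."""
    h, w = image.shape[-2:]
    # Slicing needs Python ints; for a GPU tensor box this conversion copies 4 numbers.
    y1, x1, y2, x2 = (int(v) for v in box)
    # Expand on every side, then clamp so the crop stays inside the image.
    y1 = max(y1 - margin, 0)
    x1 = max(x1 - margin, 0)
    y2 = min(y2 + margin, h)
    x2 = min(x2 + margin, w)
    return image[..., y1:y2, x1:x2]

image = torch.arange(100).reshape(10, 10)
center_crop = crop_with_margin(image, (4, 4, 6, 6), margin=2)  # rows 2..7, cols 2..7
corner_crop = crop_with_margin(image, (0, 0, 2, 2), margin=3)  # clamped at the border
```

A margin proportional to the box size (for example 10% of its height and width) would work the same way; only the amount added before clamping changes.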

Using skimage, NumPy and pandas (pandas is not strictly necessary):

from typing import Dict, List, Optional

import numpy as np
import pandas as pd
from skimage.measure import label, regionprops
import torch as th

def simple_boxing(
    classmasks: th.Tensor,
    file_ids: Optional[List[int]] = None,
    channel2class: Optional[Dict[int, int]] = None,
) -> pd.DataFrame:
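    """Convert a tensor of shape (B, C, H, W) to a pandas DataFrame with bounding boxes.

    Args:
        classmasks: Boolean tensor with shape (batch_size, channels, height, width).
        file_ids: Optional ids, one per batch element; the batch index is used if omitted.
        channel2class: Optional mapping from channel index to class id.
    """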
    assert classmasks.dtype == th.bool, "classmasks data type must be boolean."
    assert (
        len(classmasks.shape) == 4
    ), f"classmasks with shape {classmasks.shape} should have 4 dimensions"

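    # To save time, skip all empty channels: keep only the (image, channel) index
    # pairs whose mask has at least one True pixel.
    # NOTE: the original expression was cut off after the .view(...) call; the
    # .any(...).nonzero().tolist() tail is an assumed completion based on the loop below.
    has_detections = (
        classmasks.view(*classmasks.shape[0:2], -1).any(dim=-1).nonzero().tolist()
    )
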
    boxes_list = []
    classmasks = classmasks.cpu().numpy()

    for img_idx, channel_idx in has_detections:
        labels = label(classmasks[img_idx, channel_idx], background=0, connectivity=2)
        props = regionprops(labels)
        file_id = file_ids[img_idx] if file_ids is not None else img_idx
        boxes_list.extend([(file_id, *x.bbox, channel_idx) for x in props])

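    # COLUMN_NAMES_SEGMENTATION was not defined in the post. The names here are assumed
    # from the tuple layout (skimage's bbox is min_row, min_col, max_row, max_col) and
    # from the "channel_id" column used below.
    COLUMN_NAMES_SEGMENTATION = ["file_id", "y_min", "x_min", "y_max", "x_max", "channel_id"]
    boxes = pd.DataFrame(boxes_list, columns=COLUMN_NAMES_SEGMENTATION)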

    if channel2class is not None:
        boxes["class_id"] = boxes["channel_id"].map(channel2class)

    return boxes

Just in case someone needs it.