Exact coordinates of bounding box for Mask R-CNN?

Given a 5x5 mask (see a.; black = 0, red = 1), what are the coordinates of its bounding box?

Intuitively, I would think of the image as a grid with pixels drawn from the top-left corner, giving bounding box coordinates of (2, 2, 4, 4) (see b.).

Some image renderers, like plt, have led me to believe that pixels are drawn from the middle, which would give box coordinates of either (2, 2, 3, 3) (see c.) or possibly (1.5, 1.5, 3.5, 3.5) (see d.).

Which one of these would be correct? Are any of them?

The regions will be used to train a Mask R-CNN model, but I guess that the way of calculating bounding boxes is similar for other models too.

Hey @ZimoNitrome, figure (b) makes sense: it matches how popular R-CNN implementations compute bounding boxes.
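To make the three candidate conventions concrete, here is a minimal sketch in plain Python. The figure is not reproduced in this thread, so the mask below is an assumption: a 2x2 block of ones at rows and columns 2 to 3, which is the layout consistent with the coordinates quoted in the question. The function name `mask_to_boxes` and the dictionary keys are hypothetical and not taken from any library.

```python
# Assumed 5x5 mask: the position of the red (1) pixels is inferred from
# the coordinates quoted in the question, not read from the figure.
mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
]


def mask_to_boxes(mask):
    """Return (x1, y1, x2, y2) for a binary mask under three conventions."""
    # Rows and columns that contain at least one foreground pixel.
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows:
        raise ValueError("mask contains no foreground pixels")

    x_min, x_max = min(cols), max(cols)
    y_min, y_max = min(rows), max(rows)
    return {
        # (b) pixel i covers [i, i + 1): the box runs along outer pixel edges.
        "edges": (x_min, y_min, x_max + 1, y_max + 1),
        # (c) first and last foreground pixel indices, both inclusive.
        "indices": (x_min, y_min, x_max, y_max),
        # (d) pixel centers at integers, so edges sit half a pixel out.
        "centers": (x_min - 0.5, y_min - 0.5, x_max + 0.5, y_max + 0.5),
    }


boxes = mask_to_boxes(mask)
print(boxes["edges"])    # (2, 2, 4, 4)
print(boxes["indices"])  # (2, 2, 3, 3)
print(boxes["centers"])  # (1.5, 1.5, 3.5, 3.5)
```

Only the "edges" variant yields a box whose width and height (x2 - x1, y2 - y1) equal the number of masked pixels along each axis, which is one way to read why (b) is the natural choice. Whichever convention you pick, it is worth checking that the rest of your training pipeline assumes the same one.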
