COCO Detection dataset: bounding box annotations shifted

JVGD · September 11, 2020, 9:05am

Hi all, I am writing to see if you can help me. I am trying to set up the COCO Detection dataset for some experiments. However, when validating the images and annotations, I find that the bounding boxes are shifted. It is probably a bug in my code, but I just can’t find it, and since the code is so simple, I am starting to think the problem could be in the annotations or in the CocoDetection class itself.

Could anyone help me?


import os

import cv2
import numpy as np
from torchvision.datasets import CocoDetection


class COCODataset(CocoDetection):
    """Dataset class for Microsoft COCO dataset
    """
    def __init__(self, ds_path,
                 images='val2017',
                 annots='annotations/instances_val2017.json'):
        # Getting paths and init base class
        images_path = os.path.join(ds_path, images)
        annots_path = os.path.join(ds_path, annots)
        super().__init__(images_path, annots_path)

    def __getitem__(self, index):
        """Get the image and the boxes for the sample 'index'
        """
        # Getting sample and target
        sample, target = super().__getitem__(index)

        # Ground truth coding: image and boxes
        # Format the PIL image as a numpy ndarray
        image = np.array(sample)
        image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)

        # Getting bounding boxes for objects in image
        boxes = []
        for obj in target:
            # Reading ground truth
            box_x = obj['bbox'][0]
            box_y = obj['bbox'][1]
            box_w = obj['bbox'][2]
            box_h = obj['bbox'][3]
            idx = obj['id']

            # Formatting ground truth and storing it
            box = [box_x, box_y, box_w, box_h, idx]
            boxes.append(box)

        # Drawing
        for box in boxes:
            box_x = int(box[0])
            box_y = int(box[1])
            box_w = int(box[2])
            box_h = int(box[3])

            # Draw center point
            cv2.circle(
                img=image,
                center=(box_x, box_y),
                radius=1,
                color=(0, 0, 255),
                thickness=3
            )

            # Draw rectangle: from [x,y,w,h] to [xmin,ymin,xmax,ymax]
            cv2.rectangle(
                img=image,
                pt1=(box_x-box_w//2, box_y-box_h//2),
                pt2=(box_x+box_w//2, box_y+box_h//2),
                color=(255, 0, 0),
                thickness=3
            )

        # Saving image
        filename = os.path.join('./', 'image_{}.png'.format(index))
        print('Writing test image: ', filename)
        cv2.imwrite(filename, image)


if __name__ == '__main__':
    # Testing
    ds_path = '/home/jvgd/workspace/datasets/raw/COCO'
    valid_images = os.path.join(ds_path, 'val2017')
    valid_annots = os.path.join(ds_path, 'annotations/instances_val2017.json')
    dataset = COCODataset(ds_path,
        images=valid_images,
        annots=valid_annots)
    
    # Testing the dataset by writing images with annots
    for i, _ in enumerate(dataset):
        if i >= 10: break

This assumes you have the COCO validation images and annotations from the 2017 challenge: https://cocodataset.org/#download

pfloat · September 11, 2020, 9:36am

Hello,
it seems like the code

            cv2.rectangle(
                img=image,
                pt1=(box_x-box_w//2, box_y-box_h//2),
                pt2=(box_x+box_w//2, box_y+box_h//2),
                color=(255, 0, 0),
                thickness=3
            )

treats box_x and box_y as the center of the rectangle, but in the COCO dataset they represent the top-left corner of the rectangle. So the rectangle drawing should be something like this:

            cv2.rectangle(
                img=image,
                pt1=(box_x, box_y), # top left corner
                pt2=(box_x+box_w, box_y+box_h), # bottom right corner
                color=(255, 0, 0),
                thickness=3
            )
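
As a small sketch of that conversion, assuming the box comes straight from obj['bbox'] in COCO [x, y, w, h] order, the two corner points that cv2.rectangle expects could be computed like this. The helper name coco_to_corners is hypothetical and is not part of torchvision or pycocotools.

```python
def coco_to_corners(bbox):
    """Convert a COCO box [x, y, w, h] to integer corners.

    Hypothetical helper for illustration: x and y are taken to be the
    top-left corner, w and h the box width and height in pixels.
    """
    x, y, w, h = bbox
    xmin = int(round(x))
    ymin = int(round(y))
    xmax = int(round(x + w))
    ymax = int(round(y + h))
    return xmin, ymin, xmax, ymax


# Example with a made-up box; the values are illustrative only.
xmin, ymin, xmax, ymax = coco_to_corners([10.0, 20.0, 30.0, 40.0])
print((xmin, ymin), (xmax, ymax))
```

The result can then be passed as pt1=(xmin, ymin) and pt2=(xmax, ymax).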

JVGD · September 14, 2020, 8:11am

Thank you for such a quick response and for the solution indeed!!

It was as simple as that: the COCO dataset format is [x, y, w, h], where x and y are the top-left corner and not the center point as I thought. I have been working a lot with the PASCAL VOC format, so somehow I misinterpreted the COCO format as center point + width and height. My bad.
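
For anyone who hits the same confusion, here is a minimal sketch of how the two conventions relate: the COCO layout (top-left corner + size) and the center-based layout I had wrongly assumed. Both function names are made up for this example.

```python
def coco_to_center(x, y, w, h):
    """COCO [x, y, w, h] (top-left corner) -> [cx, cy, w, h] (center)."""
    return [x + w / 2, y + h / 2, w, h]


def center_to_coco(cx, cy, w, h):
    """[cx, cy, w, h] (center) -> COCO [x, y, w, h] (top-left corner)."""
    return [cx - w / 2, cy - h / 2, w, h]


# Round trip on a made-up box: the COCO box comes back unchanged.
center_box = coco_to_center(10, 20, 30, 40)
print(center_box)
print(center_to_coco(*center_box))
```

Drawing a COCO box as if it were center-based moves the rectangle up and to the left by half its width and height, which is exactly the shift I was seeing.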

Thank you