I am going to fine-tune DETR on my own dataset and need to add some extra augmentation using albumentations. My question is: in which format should I prepare the targets in
__getitem__? Should the boxes be in yolo, coco or pascal_voc format?
The original dataset uses the coco format,
[xmin, ymin, w, h], but I saw that the dataset code converts the boxes on the way to a normalized format:
boxes[:, 2:] += boxes[:, :2]
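To check that I read this line correctly, here is a minimal sketch of what I believe it does, using plain Python lists instead of tensors. The helper name xywh_to_xyxy is my own and does not come from the DETR code.

```python
# Toy stand-in (plain lists, not tensors) for: boxes[:, 2:] += boxes[:, :2]
# Adding the top-left corner to (w, h) yields the bottom-right corner.
def xywh_to_xyxy(boxes):
    """Turn [xmin, ymin, w, h] boxes into [xmin, ymin, xmax, ymax] boxes."""
    return [[x, y, x + w, y + h] for x, y, w, h in boxes]


coco_boxes = [[10.0, 20.0, 30.0, 40.0]]  # one box in coco format
corner_boxes = xywh_to_xyxy(coco_boxes)
print(corner_boxes)  # [[10.0, 20.0, 40.0, 60.0]]
```

If that reading is right, the result is still in absolute pixels, so this step alone does not normalize anything.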
But in post-processing, I see a function that converts boxes from cxcywh to xyxy:
def box_cxcywh_to_xyxy(): ...
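For comparison, this is a sketch of what I understand a cxcywh-to-xyxy conversion to compute, together with its inverse and a normalization step. The functions are plain-Python stand-ins with hypothetical names (cxcywh_to_xyxy, xyxy_to_cxcywh, normalize_box), not the actual DETR code, and the idea that the targets end up as normalized [cx, cy, w, h] is my assumption, which is exactly what I am trying to confirm.

```python
def cxcywh_to_xyxy(box):
    """[cx, cy, w, h] -> [xmin, ymin, xmax, ymax] (same units as the input)."""
    cx, cy, w, h = box
    return [cx - 0.5 * w, cy - 0.5 * h, cx + 0.5 * w, cy + 0.5 * h]


def xyxy_to_cxcywh(box):
    """[xmin, ymin, xmax, ymax] -> [cx, cy, w, h] (same units as the input)."""
    x0, y0, x1, y1 = box
    return [(x0 + x1) / 2, (y0 + y1) / 2, x1 - x0, y1 - y0]


def normalize_box(box, img_w, img_h):
    """Scale x-like values by the image width and y-like values by the height."""
    return [box[0] / img_w, box[1] / img_h, box[2] / img_w, box[3] / img_h]


# Assumed example: a pascal_voc-style box in pixels on a 100x200 image.
pixel_xyxy = [10.0, 20.0, 40.0, 60.0]
target = normalize_box(xyxy_to_cxcywh(pixel_xyxy), img_w=100, img_h=200)
recovered = cxcywh_to_xyxy(target)  # normalized corners, in the 0..1 range
print(target, recovered)
```

Under that assumption, I could let albumentations work in pascal_voc or coco format and only convert to normalized [cx, cy, w, h] at the end of __getitem__.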
So, in which format should I prepare the target bboxes so that they can be compared with the predicted bboxes?