What should I put in my bbox when there is no object to detect in the image?

Hi everyone, I have a custom dataset for a binary object detection problem. I use the PyTorch model fasterrcnn_resnet50_fpn. For this, as the docs say, I need a target dictionary containing 'boxes' and 'labels'.
I get an error when my class is 0 (no object to detect), because I have no bbox to put in targets['boxes'].
As a temporary solution, I set bbox=[0,0,0,0] when there is no object in my image.
Is this a correct way to do it?

def my_transform(self,img,bboxes,label):

    if self.transforms is not None and self.train:
        transformed = self.transforms(image=np.array(img),bboxes=bboxes) # bbox can be an empty list 
        img = transformed['image']
        bboxes = transformed['bboxes']
    img = TF.to_tensor(img)
    if self.normalize:
        img = TF.normalize(img,self.mean,self.std)
    label = torch.LongTensor(label)
    bboxes = [l[:4] for l in bboxes] # we don't need the object name anymore
    bboxes = torch.FloatTensor(bboxes)
    targets = {'boxes':bboxes,'labels':label}
    return img, targets

def __getitem__(self,index):
    img = Image.open(os.path.join(self.root_images,self.file_names[index]+'.png')).convert('RGB')
    with open(os.path.join(self.root_annotation,self.file_names[index]+'.json'),'r') as f:
        annotation = json.loads(f.read().strip())
    bboxes = []
    label = []
    try:
        for b in annotation['outputs']['object']:
            bbox = [i for i in b['bndbox'].values()] #xmin ymin xmax ymax #VOC Format
            bbox.append(b['name']) # the label here is exclusively 'drone' for now
            bboxes.append(bbox)
            label.append(1)
    except Exception as err:
        raise RuntimeError('Something wrong with the annotation format') from err
    
    if len(annotation['outputs']['object']) == 0:
        label = [0] # if no object label is 0
        bboxes = [[0,0,0,0]]
    img, targets = self.my_transform(img, bboxes,label)
    return img,targets

Thank you for your help
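
One note on the [0,0,0,0] placeholder above: in VOC format (xmin, ymin, xmax, ymax) that box has zero width and zero height, and as far as I know recent torchvision versions raise an error for such boxes during training. A minimal sketch of a sanity check in plain Python; the helper name is_valid_voc_box is made up for illustration and is not part of torchvision:

```python
def is_valid_voc_box(box):
    """Return True if an (xmin, ymin, xmax, ymax) box has positive width and height."""
    # Only the first four entries are coordinates; a trailing class name is ignored.
    xmin, ymin, xmax, ymax = box[:4]
    return xmax > xmin and ymax > ymin


# The placeholder used for empty images is degenerate (zero area).
placeholder_ok = is_valid_voc_box([0, 0, 0, 0])
# A regular annotation passes the check.
regular_ok = is_valid_voc_box([10, 20, 50, 80])
print(placeholder_ok, regular_ok)
```

A check like this could run in __getitem__ before the boxes are converted to a tensor, so a bad annotation is reported with its file name instead of failing inside the model.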

Hi @4bach, I’m facing a similar problem. I want to train my model with some unannotated images and I don’t know what to do with them. I tried to stick to the TorchVision Object Detection Finetuning Tutorial, where a dictionary is created for the target that contains bounding boxes and labels among other things. However, the tutorial does not really cover images without objects.

Have you found a solution? Have you tried setting your bounding boxes to None or something like that? I tried to use bbox = torch.zeros((0, 4), dtype=torch.float32) without success.
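
For reference, a minimal sketch of what a target for an image without objects could look like. It assumes a torchvision release whose detection models accept empty targets (as far as I know this was added in 0.6; older versions fail), and the helper names parse_objects and to_target are hypothetical. The idea is that 'boxes' keeps the shape (0, 4) and 'labels' is an empty int64 tensor, instead of a [0,0,0,0] box with label 0:

```python
def parse_objects(annotation):
    """Split an annotation into VOC boxes and labels; both stay empty for a negative image."""
    boxes, labels = [], []
    for obj in annotation['outputs']['object']:
        # The dict order is assumed to be xmin, ymin, xmax, ymax, as in the post above.
        boxes.append([float(v) for v in obj['bndbox'].values()])
        labels.append(1)  # single foreground class; 0 is the background class
    return boxes, labels


def to_target(boxes, labels):
    """Build the target dict; torch is imported here so parse_objects works without it."""
    import torch
    return {
        # reshape(-1, 4) turns an empty list into shape (0, 4) instead of (0,)
        'boxes': torch.tensor(boxes, dtype=torch.float32).reshape(-1, 4),
        'labels': torch.tensor(labels, dtype=torch.int64),
    }


# An annotation with no objects gives empty lists, not a placeholder box.
empty_boxes, empty_labels = parse_objects({'outputs': {'object': []}})
```

Calling to_target(empty_boxes, empty_labels) then gives a 'boxes' tensor of shape (0, 4) and a 'labels' tensor of shape (0,). Note that the labels have to be empty as well; a label of 0 with no matching box does not line up with the boxes. If training still fails with a target like this, the installed torchvision version is the first thing I would check.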