GeneralizedRCNNTransform Null Annotation Format

I am trying to use GeneralizedRCNNTransform. The forward method takes in a list of images and a list of targets. If one of the images in the list has no bounding boxes, what format should its target have?
If I make that item {'labels': tensor([]), 'boxes': tensor([])}, I get this error:

~/anaconda3/envs/cv/lib/python3.9/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1100         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102             return forward_call(*input, **kwargs)
   1103         # Do not call functions when jit is used
   1104         full_backward_hooks, non_full_backward_hooks = [], []

~/anaconda3/envs/cv/lib/python3.9/site-packages/torchvision/models/detection/transform.py in forward(self, images, targets)
    110                                  "of shape [C, H, W], got {}".format(image.shape))
    111             image = self.normalize(image)
--> 112             image, target_index = self.resize(image, target_index)
    113             images[i] = image
    114             if targets is not None and target_index is not None:

~/anaconda3/envs/cv/lib/python3.9/site-packages/torchvision/models/detection/transform.py in resize(self, image, target)
    161 
    162         bbox = target["boxes"]
--> 163         bbox = resize_boxes(bbox, (h, w), image.shape[-2:])
    164         target["boxes"] = bbox
    165 

~/anaconda3/envs/cv/lib/python3.9/site-packages/torchvision/models/detection/transform.py in resize_boxes(boxes, original_size, new_size)
    277     ]
    278     ratio_height, ratio_width = ratios
--> 279     xmin, ymin, xmax, ymax = boxes.unbind(1)
    280 
    281     xmin = xmin * ratio_width

IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)
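
For context, here is a minimal sketch of roughly what I'm doing (the min_size/max_size and normalization values are placeholders, not my real config):

import torch
from torchvision.models.detection.transform import GeneralizedRCNNTransform

transform = GeneralizedRCNNTransform(min_size=800, max_size=1333,
                                     image_mean=[0.485, 0.456, 0.406],
                                     image_std=[0.229, 0.224, 0.225])

images = [torch.rand(3, 300, 400)]
# both tensors are 1-D with shape [0]
targets = [{'labels': torch.tensor([]), 'boxes': torch.tensor([])}]
image_list, targets = transform(images, targets)  # raises the IndexError above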

And if I make the item None, I get:

~/anaconda3/envs/cv/lib/python3.9/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1100         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102             return forward_call(*input, **kwargs)
   1103         # Do not call functions when jit is used
   1104         full_backward_hooks, non_full_backward_hooks = [], []

~/anaconda3/envs/cv/lib/python3.9/site-packages/torchvision/models/detection/transform.py in forward(self, images, targets)
     98             for t in targets:
     99                 data: Dict[str, Tensor] = {}
--> 100                 for k, v in t.items():
    101                     data[k] = v
    102                 targets_copy.append(data)

AttributeError: 'NoneType' object has no attribute 'items'

In general, I am struggling to figure out what the null annotation format is. What should the boxes and labels tensors look like when an image has no bounding boxes?
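
From the first traceback, resize_boxes calls boxes.unbind(1), which needs a 2-D tensor, so my guess (unverified) is that the empty tensors need explicit shapes, something like:

empty_target = {
    'boxes': torch.zeros((0, 4), dtype=torch.float32),  # zero rows, four coordinates per row
    'labels': torch.zeros((0,), dtype=torch.int64),     # zero class ids
}

With this, boxes.unbind(1) would return four empty tensors instead of failing, but I don't know whether this is the intended format.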

When I make the whole targets list None and have only one image in the image list, it goes through the transform but then errors in FasterRCNN's RPN. However, if, after the transform, I change targets back from None to {'labels': tensor([]), 'boxes': tensor([])}, it works. Changing the type of targets between the transform and the actual network seems like a hacky solution, though.
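
Concretely, the hack looks something like this:

# run the transform with targets=None, then re-attach empty target dicts
# before calling the rest of the model
image_list, _ = transform(images, None)
targets = [{'labels': torch.tensor([]), 'boxes': torch.tensor([])}]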

I have not tried Faster R-CNN with null input, but it seems the code does not support it. You mentioned that images with no bounding boxes or labels are being used as input for training. Why don't you filter out the images that have no labels before starting training? These images won't contribute to training anyway, since they don't have any labels.
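
A rough sketch of that filtering, assuming your dataset returns (image, target) pairs (the names here are illustrative):

from torch.utils.data import Subset

# keep only the samples whose boxes tensor is non-empty
keep = [i for i in range(len(dataset))
        if dataset[i][1]['boxes'].numel() > 0]
filtered_dataset = Subset(dataset, keep)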