I am trying to use GeneralizedRCNNTransform. The forward method takes in a list of images and targets. If I pass in a list of images where one of the images has no bounding boxes, what should the format be?
If I make that item {'labels': tensor([]), 'boxes': tensor([])}, then I get an error:
~/anaconda3/envs/cv/lib/python3.9/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1100 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1101 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102 return forward_call(*input, **kwargs)
1103 # Do not call functions when jit is used
1104 full_backward_hooks, non_full_backward_hooks = [], []
~/anaconda3/envs/cv/lib/python3.9/site-packages/torchvision/models/detection/transform.py in forward(self, images, targets)
110 "of shape [C, H, W], got {}".format(image.shape))
111 image = self.normalize(image)
--> 112 image, target_index = self.resize(image, target_index)
113 images[i] = image
114 if targets is not None and target_index is not None:
~/anaconda3/envs/cv/lib/python3.9/site-packages/torchvision/models/detection/transform.py in resize(self, image, target)
161
162 bbox = target["boxes"]
--> 163 bbox = resize_boxes(bbox, (h, w), image.shape[-2:])
164 target["boxes"] = bbox
165
~/anaconda3/envs/cv/lib/python3.9/site-packages/torchvision/models/detection/transform.py in resize_boxes(boxes, original_size, new_size)
277 ]
278 ratio_height, ratio_width = ratios
--> 279 xmin, ymin, xmax, ymax = boxes.unbind(1)
280
281 xmin = xmin * ratio_width
IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)
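As far as I can tell, the IndexError comes from the shape of my empty boxes tensor, not from the emptiness itself: tensor([]) is 1-D with shape (0,), so resize_boxes cannot unbind it along dimension 1. A minimal reproduction of what I'm seeing (outside the model):

```python
import torch

# A 1-D empty tensor has shape (0,), so there is no dimension 1 to unbind.
boxes = torch.tensor([])
print(boxes.shape)  # torch.Size([0])

try:
    xmin, ymin, xmax, ymax = boxes.unbind(1)
except IndexError as e:
    print(e)  # same "Dimension out of range" error as in the traceback

# By contrast, a 2-D empty tensor of shape (0, 4) unbinds into four
# empty coordinate tensors without complaint.
boxes_2d = torch.zeros((0, 4))
xmin, ymin, xmax, ymax = boxes_2d.unbind(1)
print(xmin.shape)  # torch.Size([0])
```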
And if I make the item None, I get:
~/anaconda3/envs/cv/lib/python3.9/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1100 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1101 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102 return forward_call(*input, **kwargs)
1103 # Do not call functions when jit is used
1104 full_backward_hooks, non_full_backward_hooks = [], []
~/anaconda3/envs/cv/lib/python3.9/site-packages/torchvision/models/detection/transform.py in forward(self, images, targets)
98 for t in targets:
99 data: Dict[str, Tensor] = {}
--> 100 for k, v in t.items():
101 data[k] = v
102 targets_copy.append(data)
AttributeError: 'NoneType' object has no attribute 'items'
In general, I am struggling to figure out what the null annotation format is. What should the target look like for an image that has no bounding boxes?
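For reference, this is the empty-target format I would guess based on the unbind error above, though I haven't confirmed it's what torchvision actually expects — the (0, 4) boxes shape and the int64 labels dtype are my assumptions:

```python
import torch

# My guess at an "empty" target: zero boxes that still keep the usual
# (N, 4) layout, and zero labels with the int64 dtype used elsewhere.
empty_target = {
    "boxes": torch.zeros((0, 4), dtype=torch.float32),
    "labels": torch.zeros((0,), dtype=torch.int64),
}
```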
When I make the whole targets list None and have only one image in the image list, it goes through the transform but then fails in FasterRCNN's RPN. However, if after the transform I change targets back from None to {'labels': tensor([]), 'boxes': tensor([])}, it works. But changing the targets type between the transform and the actual network seems like a hacky solution.