As far as I know, we should label every object of a given category in an image, since otherwise the loss function will penalize detections that are correct but missing from the target.
But in practice it doesn’t work that way for me.
I’m using torchvision.models.detection.fasterrcnn_resnet50_fpn for custom object detection.
I have images which contain from 1 to ~20 objects.
I tried two different approaches for creating target variable:
- target variable consists of only one object every time:
  [{'boxes': tensor([[313, 34, 369, 62]], device='cuda:0'),
    'labels': tensor([13], device='cuda:0')}]
- target variable consists of all objects that are present in image:
  [{'boxes': tensor([[313, 34, 369, 62], [332, 244, 389, 274]], device='cuda:0'),
    'labels': tensor([13, 13], device='cuda:0')}]
In this specific task, the smallest loss I could get was when I created a separate target for each individual object (the first case).
This doesn’t match the theory, so can someone tell me what I am doing wrong?