Collate Issue with Faster R-CNN (tries to transform dictionary in GeneralizedRCNN)

I am trying to create a custom dataset to pass into torch.utils.data.DataLoader for training Faster R-CNN: https://pytorch.org/docs/stable/torchvision/models.html#faster-r-cnn

The instructions indicate the model expects a list of images and a list of dicts (List[Dict[str, Tensor]]) for the targets, with keys 'boxes' and 'labels'. I create this in my dataset's __getitem__, which returns the image as a tensor and a dictionary with keys 'boxes' and 'labels', where 'boxes' holds a list of 4 floats and 'labels' an int.
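For reference, a minimal sketch of what that __getitem__ might look like (the class and attribute names here are just illustrative):

import torch
from torch.utils.data import Dataset

class DetectionDataset(Dataset):
    def __init__(self, images, annotations):
        self.images = images            # list of image tensors [C, H, W]
        self.annotations = annotations  # list of (boxes, labels) per image

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        boxes, labels = self.annotations[idx]
        # one dict per sample, matching the documented target format
        target = {'boxes': torch.as_tensor(boxes, dtype=torch.float32),
                  'labels': torch.as_tensor(labels, dtype=torch.int64)}
        return self.images[idx], target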

When I try training, I run into the following issue:

  File "/home/mschiappa/Desktop/VisualRelationshipsDetection/run_exp.py", line 118, in run_model
    outputs = model(inputs, targets)
  File "/home/mschiappa/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/lib/python3/dist-packages/torchvision/models/detection/generalized_rcnn.py", line 47, in forward
    images, targets = self.transform(images, targets)
  File "/home/mschiappa/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/lib/python3/dist-packages/torchvision/models/detection/transform.py", line 36, in forward
    target = targets[i] if targets is not None else targets
KeyError: 0

and when I print the targets, they look like this:

{'boxes': tensor([[0.6547, 1.0000, 0.4703, 0.4938],
        [0.7802, 0.9403, 0.0472, 0.0796],
        [0.9897, 0.9823, 0.0000, 0.4879],
        [1.0000, 0.4943, 0.8626, 0.3522],
        [1.0000, 1.0000, 0.0000, 0.0000],
        [1.0000, 0.2750, 0.8687, 0.0000],
        [0.9956, 0.7544, 0.8100, 0.5774],
        [1.0000, 0.9531, 0.0000, 0.1137]]), 'labels': tensor([227, 445, 432, 307, 390, 281, 419, 333])}

When I look into the GeneralizedRCNN code, which FasterRCNN inherits from, it tries to transform the targets: https://github.com/pytorch/vision/blob/master/torchvision/models/detection/generalized_rcnn.py#L64
This shouldn't work since targets is a single dictionary rather than a list, and that appears to be where it is failing.
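The KeyError: 0 fits that reading: transform.py indexes targets[i] while iterating over the images, which works on a list but on a dict becomes a lookup of the integer key 0, e.g.:

target = {'boxes': [[0.65, 1.0, 0.47, 0.49]], 'labels': [227]}
target[0]  # raises KeyError: 0 -- dicts are looked up by key, not position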

I am unsure how to fix this without modifying the PyTorch FasterRCNN code.

Currently making progress with the fix of changing:

images, targets = self.transform(images, targets)

to:

images, _ = self.transform(images)

Could you try to pass the targets as a list of dicts, where each dict in the list corresponds to the boxes and labels for the matching element in the data list?

EDIT: This code should work:

import torch
from torchvision import models

model = models.detection.fasterrcnn_resnet50_fpn()

x = [torch.rand(3, 300, 400), torch.rand(3, 500, 400)]
y = [{'boxes': torch.tensor([[0.6547, 1.0000, 0.4703, 0.4938],
        [0.7802, 0.9403, 0.0472, 0.0796],
        [0.9897, 0.9823, 0.0000, 0.4879],
        [1.0000, 0.4943, 0.8626, 0.3522],
        [1.0000, 1.0000, 0.0000, 0.0000],
        [1.0000, 0.2750, 0.8687, 0.0000],
        [0.9956, 0.7544, 0.8100, 0.5774],
        [1.0000, 0.9531, 0.0000, 0.1137]]),
     'labels': torch.tensor([227, 445, 432, 307, 390, 281, 419, 333])},
    {'boxes': torch.tensor([[0.6547, 1.0000, 0.4703, 0.4938],
        [0.7802, 0.9403, 0.0472, 0.0796],
        [0.9897, 0.9823, 0.0000, 0.4879],
        [1.0000, 0.4943, 0.8626, 0.3522],
        [1.0000, 1.0000, 0.0000, 0.0000],
        [1.0000, 0.2750, 0.8687, 0.0000],
        [0.9956, 0.7544, 0.8100, 0.5774],
        [1.0000, 0.9531, 0.0000, 0.1137]]),
     'labels': torch.tensor([227, 445, 432, 307, 390, 281, 419, 333])}
]

output = model(x, y)
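
When the model is in training mode (the default for a freshly constructed module), output will be a dict of losses (loss_classifier, loss_box_reg, loss_objectness, loss_rpn_box_reg) rather than detections; call model.eval() and pass only x to get predictions. Note also that the box values above are simply the ones printed earlier in the thread; recent torchvision releases reject degenerate training boxes where x2 <= x1 or y2 <= y1, so they may need adjusting there.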

Now I am getting this error:

ImportError: /usr/lib/python3/dist-packages/torchvision/_C.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN3c107Warning4warnENS_14SourceLocationENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE

I get the same error if I change the model:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/mschiappa/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/lib/python3/dist-packages/torchvision/models/detection/generalized_rcnn.py", line 52, in forward
    proposals, proposal_losses = self.rpn(images, features, targets)
  File "/home/mschiappa/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/lib/python3/dist-packages/torchvision/models/detection/rpn.py", line 411, in forward
    boxes, scores = self.filter_proposals(proposals, objectness, images.image_sizes, num_anchors_per_level)
  File "/usr/lib/python3/dist-packages/torchvision/models/detection/rpn.py", line 336, in filter_proposals
    keep = box_ops.batched_nms(boxes, scores, lvl, self.nms_thresh)
  File "/usr/lib/python3/dist-packages/torchvision/ops/boxes.py", line 72, in batched_nms
    keep = nms(boxes_for_nms, scores, iou_threshold)
  File "/usr/lib/python3/dist-packages/torchvision/ops/boxes.py", line 32, in nms
    _C = _lazy_import()
  File "/usr/lib/python3/dist-packages/torchvision/extension.py", line 12, in _lazy_import
    from torchvision import _C as C
ImportError: /usr/lib/python3/dist-packages/torchvision/_C.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN3c107Warning4warnENS_14SourceLocationENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
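
This kind of undefined symbol error in torchvision._C usually points to a torch/torchvision version mismatch: the traceback shows torchvision loaded from /usr/lib/python3/dist-packages while torch comes from ~/.local/lib/python3.6/site-packages, so the two were likely installed separately. Reinstalling a matching pair into the same location should clear it, e.g.:

pip install --user --upgrade torch torchvision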

Also, that is what I want to do, but when I use the Dataset class and write my own custom __getitem__ that returns the image and a dictionary, for some reason the DataLoader collates them. How do I stop this so that it returns the same format you provided when using the DataLoader class to iterate?

Could you post the modification to your model, so that we have a baseline we could debug against, please?

It won't work because later code still requires it in that format. The summary is that I need to find out why the DataLoader keeps collating my batch into one big dictionary, where each key holds a list of the batch's values, instead of a list of dictionaries, one per sample.

My __getitem__ returns image, mydict, but when the DataLoader builds a batch it becomes [image1, image2], {key1: [target1, target2], key2: [target1, target2]}.
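
That is the default collate behavior: default_collate recurses into dicts and stacks the per-sample values under each key instead of keeping one dict per sample. A small demonstration of the effect, assuming two samples with the same keys:

import torch
from torch.utils.data.dataloader import default_collate

batch = [(torch.rand(3, 4, 4), {'labels': torch.tensor([1])}),
         (torch.rand(3, 4, 4), {'labels': torch.tensor([2])})]
images, targets = default_collate(batch)
print(type(targets))      # <class 'dict'> -- one merged dict, not a list of dicts
print(targets['labels'])  # tensor([[1], [2]]) -- values stacked across samples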


Try something like this:

loader = torch.utils.data.DataLoader(
    dataset, batch_size=8, shuffle=True, num_workers=1,
    collate_fn=collate_fn
)

where collate_fn is:

def collate_fn(batch):
    # transpose the batch: [(img1, tgt1), (img2, tgt2), ...] -> [(img1, img2, ...), (tgt1, tgt2, ...)]
    return list(zip(*batch))
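
With this collate_fn the DataLoader just transposes the batch instead of merging the dicts, so each iteration yields a tuple of images and a tuple of per-sample target dicts, which can be passed straight to the detection model with something like:

# images is a tuple of image tensors, targets a tuple of target dicts
for images, targets in loader:
    loss_dict = model(list(images), list(targets))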