Collate Issue with Faster R-CNN (tries to transform dictionary in GeneralizedRCNN)

I am trying to create a custom dataset to pass into torch.utils.data.DataLoader for training Faster R-CNN: https://pytorch.org/docs/stable/torchvision/models.html#faster-r-cnn

The instructions indicate the model expects a list of images and a list of dicts (List[Dict[str, Tensor]]) for the targets, with keys 'boxes' and 'labels'. I create this in my dataset's __getitem__, which returns the image as a tensor and a dictionary with keys 'boxes' and 'labels', where 'boxes' holds a list of 4 floats and 'labels' an int.
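For reference, a minimal sketch of what that __getitem__ might look like (the class and attribute names here are just illustrative):

import torch
from torch.utils.data import Dataset

class DetectionDataset(Dataset):
    def __init__(self, images, annotations):
        self.images = images            # list of image tensors [C, H, W]
        self.annotations = annotations  # list of (boxes, labels) per image

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        boxes, labels = self.annotations[idx]
        # one dict per sample, matching the documented target format
        target = {'boxes': torch.as_tensor(boxes, dtype=torch.float32),
                  'labels': torch.as_tensor(labels, dtype=torch.int64)}
        return self.images[idx], target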

When I try training, I run into the following issue:

  File "/home/mschiappa/Desktop/VisualRelationshipsDetection/run_exp.py", line 118, in run_model
    outputs = model(inputs, targets)
  File "/home/mschiappa/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/lib/python3/dist-packages/torchvision/models/detection/generalized_rcnn.py", line 47, in forward
    images, targets = self.transform(images, targets)
  File "/home/mschiappa/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/lib/python3/dist-packages/torchvision/models/detection/transform.py", line 36, in forward
    target = targets[i] if targets is not None else targets
KeyError: 0

and when I print the targets, they look like this:

{'boxes': tensor([[0.6547, 1.0000, 0.4703, 0.4938],
        [0.7802, 0.9403, 0.0472, 0.0796],
        [0.9897, 0.9823, 0.0000, 0.4879],
        [1.0000, 0.4943, 0.8626, 0.3522],
        [1.0000, 1.0000, 0.0000, 0.0000],
        [1.0000, 0.2750, 0.8687, 0.0000],
        [0.9956, 0.7544, 0.8100, 0.5774],
        [1.0000, 0.9531, 0.0000, 0.1137]]), 'labels': tensor([227, 445, 432, 307, 390, 281, 419, 333])}

When I look into the GeneralizedRCNN code, which FasterRCNN inherits from, it tries to transform the targets: https://github.com/pytorch/vision/blob/master/torchvision/models/detection/generalized_rcnn.py#L64
This shouldn't work since targets is a single dictionary rather than a list, and that appears to be where it is failing.
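The KeyError: 0 fits that reading: transform.py indexes targets[i] while iterating over the images, which works on a list but on a dict becomes a lookup of the integer key 0, e.g.:

target = {'boxes': [[0.65, 1.0, 0.47, 0.49]], 'labels': [227]}
target[0]  # raises KeyError: 0 -- dicts are looked up by key, not position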

I am unsure how to fix this without modifying the PyTorch FasterRCNN code.

Currently making progress with the fix of changing:

images, targets = self.transform(images, targets)

to:

images, _ = self.transform(images)

Could you try to pass the targets as a list of dicts, where each dict in the list corresponds to the boxes and labels for the matching element in the data list?

EDIT: This code should work:

import torch
from torchvision import models

model = models.detection.fasterrcnn_resnet50_fpn()

x = [torch.rand(3, 300, 400), torch.rand(3, 500, 400)]
y = [{'boxes': torch.tensor([[0.6547, 1.0000, 0.4703, 0.4938],
        [0.7802, 0.9403, 0.0472, 0.0796],
        [0.9897, 0.9823, 0.0000, 0.4879],
        [1.0000, 0.4943, 0.8626, 0.3522],
        [1.0000, 1.0000, 0.0000, 0.0000],
        [1.0000, 0.2750, 0.8687, 0.0000],
        [0.9956, 0.7544, 0.8100, 0.5774],
        [1.0000, 0.9531, 0.0000, 0.1137]]),
     'labels': torch.tensor([227, 445, 432, 307, 390, 281, 419, 333])},
    {'boxes': torch.tensor([[0.6547, 1.0000, 0.4703, 0.4938],
        [0.7802, 0.9403, 0.0472, 0.0796],
        [0.9897, 0.9823, 0.0000, 0.4879],
        [1.0000, 0.4943, 0.8626, 0.3522],
        [1.0000, 1.0000, 0.0000, 0.0000],
        [1.0000, 0.2750, 0.8687, 0.0000],
        [0.9956, 0.7544, 0.8100, 0.5774],
        [1.0000, 0.9531, 0.0000, 0.1137]]),
     'labels': torch.tensor([227, 445, 432, 307, 390, 281, 419, 333])}
]

output = model(x, y)
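
When the model is in training mode (the default for a freshly constructed module), output will be a dict of losses (loss_classifier, loss_box_reg, loss_objectness, loss_rpn_box_reg) rather than detections; call model.eval() and pass only x to get predictions. Note also that the box values above are simply the ones printed earlier in the thread; recent torchvision releases reject degenerate training boxes where x2 <= x1 or y2 <= y1, so they may need adjusting there.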

Now I am getting this error:

ImportError: /usr/lib/python3/dist-packages/torchvision/_C.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN3c107Warning4warnENS_14SourceLocationENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE

I get the same error if I change the model:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/mschiappa/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/lib/python3/dist-packages/torchvision/models/detection/generalized_rcnn.py", line 52, in forward
    proposals, proposal_losses = self.rpn(images, features, targets)
  File "/home/mschiappa/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/lib/python3/dist-packages/torchvision/models/detection/rpn.py", line 411, in forward
    boxes, scores = self.filter_proposals(proposals, objectness, images.image_sizes, num_anchors_per_level)
  File "/usr/lib/python3/dist-packages/torchvision/models/detection/rpn.py", line 336, in filter_proposals
    keep = box_ops.batched_nms(boxes, scores, lvl, self.nms_thresh)
  File "/usr/lib/python3/dist-packages/torchvision/ops/boxes.py", line 72, in batched_nms
    keep = nms(boxes_for_nms, scores, iou_threshold)
  File "/usr/lib/python3/dist-packages/torchvision/ops/boxes.py", line 32, in nms
    _C = _lazy_import()
  File "/usr/lib/python3/dist-packages/torchvision/extension.py", line 12, in _lazy_import
    from torchvision import _C as C
ImportError: /usr/lib/python3/dist-packages/torchvision/_C.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN3c107Warning4warnENS_14SourceLocationENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
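
This kind of undefined symbol error in torchvision._C usually points to a torch/torchvision version mismatch: the traceback shows torchvision loaded from /usr/lib/python3/dist-packages while torch comes from ~/.local/lib/python3.6/site-packages, so the two were likely installed separately. Reinstalling a matching pair into the same location should clear it, e.g.:

pip install --user --upgrade torch torchvision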

Also, that is what I want to do, but when I use the Dataset class and write my own custom __getitem__ that returns the image and a dictionary, for some reason the DataLoader collates them. How do I stop this so that it returns the same format you provided when using the DataLoader class to iterate?

Could you post the modification to your model, so that we have a baseline we could debug against, please?

It won't work because later code still requires it in that format. The summary is that I need to find out why the DataLoader keeps collating my batch into one big dictionary, where each key holds a list of the batch's values, instead of a list of dictionaries, one per sample.

My __getitem__ returns image, mydict, but when the DataLoader builds a batch it becomes [image1, image2], {key1: [target1, target2], key2: [target1, target2]}.
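
That is the default collate behavior: default_collate recurses into dicts and stacks the per-sample values under each key instead of keeping one dict per sample. A small demonstration of the effect, assuming two samples with the same keys:

import torch
from torch.utils.data.dataloader import default_collate

batch = [(torch.rand(3, 4, 4), {'labels': torch.tensor([1])}),
         (torch.rand(3, 4, 4), {'labels': torch.tensor([2])})]
images, targets = default_collate(batch)
print(type(targets))      # <class 'dict'> -- one merged dict, not a list of dicts
print(targets['labels'])  # tensor([[1], [2]]) -- values stacked across samples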


Try something like this:

loader = torch.utils.data.DataLoader(
    dataset, batch_size=8, shuffle=True, num_workers=1,
    collate_fn=collate_fn
)

where collate_fn is:

def collate_fn(batch):
    # transpose the batch: [(img1, tgt1), (img2, tgt2), ...] -> [(img1, img2, ...), (tgt1, tgt2, ...)]
    return list(zip(*batch))
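
With this collate_fn the DataLoader just transposes the batch instead of merging the dicts, so each iteration yields a tuple of images and a tuple of per-sample target dicts, which can be passed straight to the detection model with something like:

# images is a tuple of image tensors, targets a tuple of target dicts
for images, targets in loader:
    loss_dict = model(list(images), list(targets))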