Detectron2 test loader error

I am following the Detectron2/Detic tutorial in order to auto-crop a large training dataset from the Kaggle Happy Whale and Dolphin competition.

Can anyone help out with the TypeError I am getting? It seems like Detectron2 tries to index a list with a dict key.

The config cfg, vocabulary, etc. are the same as in the tutorial.

Here is a snippet of my code; unfortunately it is not reproducible without the actual images, but hopefully it is informative enough to make sense.

from detectron2.modeling import build_model
from detectron2.data import build_detection_test_loader
import torch

def my_dataset_generator():
    # Build a list of dataset dicts (one per image) from the competition dataframe
    data_list = []
    for index, observation in df.iterrows():
        data_dict = {}
        data_dict["file_name"] = observation['file_path']
        data_dict['image_id'] = observation['image']
        data_list.append(data_dict)

    return data_list

model = build_model(cfg)  # returns a torch.nn.Module
test_loader = build_detection_test_loader(cfg, 'my_whales2')

which prints

[02/28 12:46:55 d2.data.dataset_mapper]: [DatasetMapper] Augmentations used in inference: [ResizeShortestEdge(short_edge_length=(800, 800), max_size=1333, sample_style='choice')]
[02/28 12:46:55 d2.data.common]: Serializing 51033 elements to byte tensors and concatenating them all ...
[02/28 12:46:55 d2.data.common]: Serialized dataset takes 6.42 MiB

Printing a sample from the test loader looks like this:

a = next(iter(test_loader))
a
[{'file_name': '/home/jupyter/happy-whale-and-dolphin/train_images/00021adfb725ed.jpg',
  'image_id': '00021adfb725ed.jpg',
  'width': 804,
  'height': 671,
  'image': tensor([[[ 74,  74,  73,  ...,  71,  72,  73],
           [ 76,  76,  77,  ...,  63,  64,  65],
           [ 80,  81,  83,  ...,  60,  60,  61],
           ...,
           [ 44,  42,  43,  ...,  50,  51,  51],
           [ 43,  41,  42,  ...,  49,  50,  50],
           [ 43,  41,  41,  ...,  50,  50,  50]],
  
          [[120, 120, 119,  ..., 113, 114, 115],
           [122, 122, 123,  ..., 104, 104, 105],
           [126, 127, 130,  ...,  99,  99, 100],
           ...,
           [ 45,  43,  43,  ...,  48,  49,  50],
           [ 46,  44,  43,  ...,  49,  50,  50],
           [ 45,  43,  43,  ...,  50,  50,  50]],
  
          [[170, 170, 169,  ..., 163, 164, 165],
           [172, 172, 173,  ..., 155, 155, 156],
           [175, 177, 178,  ..., 149, 149, 150],
           ...,
           [ 54,  53,  53,  ...,  58,  59,  60],
           [ 55,  53,  53,  ...,  57,  58,  58],
           [ 57,  55,  55,  ...,  58,  58,  58]]], dtype=torch.uint8)}]

I then try to run inference on that data loader:

model.eval()
with torch.no_grad():
    outputs = model(test_loader)

which produces this stack trace:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_14737/2278559883.py in <module>
      1 model.eval()
      2 with torch.no_grad():
----> 3     outputs = model(test_loader)

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1100         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102             return forward_call(*input, **kwargs)
   1103         # Do not call functions when jit is used
   1104         full_backward_hooks, non_full_backward_hooks = [], []

~/Detic/detic/modeling/meta_arch/custom_rcnn.py in forward(self, batched_inputs)
    113         """
    114         if not self.training:
--> 115             return self.inference(batched_inputs)
    116 
    117         images = self.preprocess_image(batched_inputs)

~/Detic/detic/modeling/meta_arch/custom_rcnn.py in inference(self, batched_inputs, detected_instances, do_postprocess)
     94         assert detected_instances is None
     95 
---> 96         images = self.preprocess_image(batched_inputs)
     97         features = self.backbone(images.tensor)
     98         proposals, _ = self.proposal_generator(images, features, None)

/opt/conda/lib/python3.7/site-packages/detectron2/modeling/meta_arch/rcnn.py in preprocess_image(self, batched_inputs)
    222         Normalize, pad and batch the input images.
    223         """
--> 224         images = [x["image"].to(self.device) for x in batched_inputs]
    225         images = [(x - self.pixel_mean) / self.pixel_std for x in images]
    226         images = ImageList.from_tensors(images, self.backbone.size_divisibility)

/opt/conda/lib/python3.7/site-packages/detectron2/modeling/meta_arch/rcnn.py in <listcomp>(.0)
    222         Normalize, pad and batch the input images.
    223         """
--> 224         images = [x["image"].to(self.device) for x in batched_inputs]
    225         images = [(x - self.pixel_mean) / self.pixel_std for x in images]
    226         images = ImageList.from_tensors(images, self.backbone.size_divisibility)

TypeError: list indices must be integers or slices, not str

Here is the function that throws:

    def preprocess_image(self, batched_inputs: List[Dict[str, torch.Tensor]]):
        """
        Normalize, pad and batch the input images.
        """
        images = [x["image"].to(self.device) for x in batched_inputs]
        images = [(x - self.pixel_mean) / self.pixel_std for x in images]
        images = ImageList.from_tensors(images, self.backbone.size_divisibility)
        return images

Did I set up, or call, my data loader improperly? The stack trace would make sense if each element of batched_inputs were a dict wrapped in a list, right?

Can you try checking the type of the elements within batched_inputs in the function preprocess_image?

The error seems to be saying that x in this line

images = [x["image"].to(self.device) for x in batched_inputs]

is a list rather than a dict that you are expecting.

@nivek I agree. The output of

data_iter = iter(test_loader)
for x in data_iter:
    print(type(x))

is

<class 'list'>
<class 'list'>
<class 'list'>
(...)
<class 'list'>
<class 'list'>

From the Detectron2 docs regarding models and data loaders:

The output of the default DatasetMapper is a dict that follows the above format. After the data loader performs batching, it becomes list[dict] which the builtin models support.

My data loader seems to generate list[list[dict]], and I am not sure why.
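
For reference, the per-image dict format the docs describe would look roughly like this (just a sketch with a placeholder tensor, using the model built above):

dummy_inputs = [{
    'image': torch.zeros(3, 800, 800, dtype=torch.uint8),  # CHW tensor, one dict per image
    'height': 800,   # desired output height
    'width': 800,    # desired output width
}]
model.eval()
with torch.no_grad():
    outputs = model(dummy_inputs)  # builtin models take a list[dict], not the DataLoader itself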

I don’t know how my_dataset_generator is being used since its usage is excluded from your code snippet.

My best guess is that the function my_dataset_generator is returning list[dict]. Subsequently, build_detection_test_loader returns a DataLoader with batch_size > 1. So when batching happens in the DataLoader, you will get list[list[dict]].
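
To illustrate the nesting with a toy example (a passthrough collate function, unrelated to your actual dataset):

from torch.utils.data import DataLoader

dicts = [{'image_id': 'a'}, {'image_id': 'b'}]        # dataset elements are dicts
nested = [[{'image_id': 'a'}], [{'image_id': 'b'}]]   # dataset elements are list[dict]

passthrough = lambda batch: batch
print(next(iter(DataLoader(dicts, batch_size=2, collate_fn=passthrough))))   # -> list[dict]
print(next(iter(DataLoader(nested, batch_size=2, collate_fn=passthrough))))  # -> list[list[dict]]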

Sorry, the my_dataset_generator() function is registered (and later called) via

from detectron2.data import DatasetCatalog
DatasetCatalog.register("my_whales2", my_dataset_generator)

I forgot to include that line in my original post. From the Detectron2 docs:

func (callable) – a callable which takes no arguments and returns a list of dicts. It must return the same results if called multiple times

Regarding the data loader, it accepts a

dataset – a list of dataset dicts, or a pytorch dataset (either map-style or iterable)…

We agree that the data loader returns the wrong format, but I fail to understand what I am doing wrong according to the documentation.

If I should pass the data dicts in another format to the data loader, what would that be?

The DataLoader processes each batch with the collate_fn; you can read more about it here.

By default, the default_collate function is used. It is hard to debug why default_collate is not working in your case without looking at the exact samples in your data.

I think the easiest way is to pass a custom collate_fn into build_detection_test_loader; your custom function should take a batch of samples and collate them (map list[list[dict]] to list[dict]). That should solve the issue for you.
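
For example, something like this (a rough sketch; whether build_detection_test_loader accepts a collate_fn argument depends on your Detectron2 version, otherwise you can wrap the dataset in a plain torch DataLoader with the same function):

def flatten_collate(batch):
    # Map list[list[dict]] back to list[dict]; leave plain dicts untouched
    flat = []
    for sample in batch:
        if isinstance(sample, list):
            flat.extend(sample)
        else:
            flat.append(sample)
    return flat

# Assumes your Detectron2 version exposes collate_fn on the builder
test_loader = build_detection_test_loader(cfg, 'my_whales2', collate_fn=flatten_collate)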

Alternatively, you can try to figure out why default_collate is not working for your inputs (I think this part is the most relevant for your input), and modify your inputs accordingly before passing them into the DataLoader.
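
As an illustration of how default_collate can fail on image batches: it stacks the tensors for each key, so differently sized images raise an error:

import torch
from torch.utils.data.dataloader import default_collate

a = {'image': torch.zeros(3, 800, 1067, dtype=torch.uint8)}
b = {'image': torch.zeros(3, 800, 1333, dtype=torch.uint8)}
default_collate([a, b])  # RuntimeError: stack expects each tensor to be equal size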


I have made some progress, thanks to your clues. It turned out that if I copy-pasted default_collate into my notebook, I got a useful traceback about a torch.stack dimension mismatch. This seemed to be caused by only the short edge of the images being resized.

After switching to a custom DataLoader that resizes all edges of the images, the procedure runs longer but produces another error. Puzzling…
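
The resizing part, done through the standard builder rather than my custom DataLoader, would look roughly like this (a sketch; the 800×800 target size is just an example):

import detectron2.data.transforms as T
from detectron2.data import DatasetMapper, build_detection_test_loader

# Resize every image to a fixed size so that batched tensors have matching shapes
mapper = DatasetMapper(cfg, is_train=False, augmentations=[T.Resize((800, 800))])
test_loader = build_detection_test_loader(cfg, 'my_whales2', mapper=mapper)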