Question about custom dataset / loader and batching

Hi,

I’m building a custom dataset to read Pascal VOC data (JPEG images and XML annotations). The basic flow seems to work, but I’m stuck on getting a DataLoader to do the batching correctly. I’m not using a custom sampler and am wondering if that is the problem.

My dataset’s __getitem__() looks like this:

    def __getitem__(self, item):
        name = self.image_names[item]
        img_file = os.path.join(self.path_jpg, name + '.jpg')
        xml_file = os.path.join(self.path_xml, name + '.xml')

        image = Image.open(img_file)
        orig_size_x = image.size[0]
        orig_size_y = image.size[1]

        boxes = self.MakeDict(xml_file)  # parse the XML annotations into a dict of boxes

        if self.output_size:
            # resize the image to a square and rescale the box coordinates to match
            image_trans = transforms.Resize((self.output_size, self.output_size))(image)
            boxes_trans = self.ResizeBoxes(orig_size_x, orig_size_y, boxes)
        else:
            image_trans = image
            boxes_trans = boxes

        image_trans = transforms.ToTensor()(image_trans)

        sample = {'name': name, 'image': image_trans,
                  'orig_size': (orig_size_y, orig_size_x),  # transpose required
                  'boxes': boxes_trans}

        return sample

which returns a sample dict that itself contains a dict, boxes = {...}, of per-object annotations.

Then I use

dataloader = DataLoader(voc_dataset,
                        batch_size=1,
                        shuffle=True,
                        num_workers=0)


for i_batch, sample_batch in enumerate(dataloader):
    if i_batch == 2:
        break

    print(i_batch, sample_batch['image'].size(),
          sample_batch['boxes'])

and that works great for batch_size=1:

0 torch.Size([1, 3, 224, 224]) {0: [('chair',), [tensor([ 36]), tensor([ 140])], [tensor([ 17]), tensor([ 80])]], 1: [('chair',), [tensor([ 16]), tensor([ 119])], [tensor([ 26]), tensor([ 64])]], 2: [('chair',), [tensor([ 5]), tensor([ 121])], [tensor([ 10]), tensor([ 63])]], 3: [('sofa',), [tensor([ 154]), tensor([ 163])], [tensor([ 140]), tensor([ 121])]]}
1 torch.Size([1, 3, 224, 224]) {0: [('pottedplant',), [tensor([ 110]), tensor([ 170])], [tensor([ 17]), tensor([ 64])]], 1: [('person',), [tensor([ 93]), tensor([ 171])], [tensor([ 31]), tensor([ 106])]]}

but for other batch sizes it crashes:

...
return {key: default_collate([d[key] for d in batch]) for key in batch[0]}
KeyError: 1

Now, if I get lucky and the samples in a batch happen to have box dicts with matching keys, it works:

0 torch.Size([2, 3, 224, 224]) {0: [('cat', 'person'), [tensor([ 137,   78]), tensor([ 120,  154])], [tensor([ 174,  155]), tensor([ 148,  140])]]}
1 torch.Size([2, 3, 224, 224]) {0: [('person', 'tvmonitor'), [tensor([ 153,   67]), tensor([ 142,   85])], [tensor([ 25,  39]), tensor([ 47,  55])]], 1: [('person', 'tvmonitor'), [tensor([ 163,  116]), tensor([ 132,   70])], [tensor([ 14,  33]), tensor([ 40,  46])]], 2: [('boat', 'pottedplant'), [tensor([  82,  186]), tensor([ 104,   84])], [tensor([ 162,   54]), tensor([ 206,   37])]]}

Clearly my batching and my dictionaries are getting crossed, so I’m doing something wrong. Do I have to define the way the batch is sampled or loaded myself?

Any help appreciated.

Thanks!

As a first shot, I think the problem is that boxes is an instance of collections.Mapping, so default_collate will be called on it recursively.
Have a look at this line of code.
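
For a minimal reproduction of what goes wrong (a sketch assuming a recent PyTorch where default_collate can be imported from torch.utils.data; older releases keep it in torch.utils.data.dataloader): default_collate iterates over the keys of the first sample’s dict, so any key missing from another sample in the batch raises exactly this KeyError.

import torch
from torch.utils.data import default_collate  # older releases: torch.utils.data.dataloader.default_collate

# Two fake samples whose boxes dicts have different keys, mimicking
# images with a different number of annotated objects.
sample_a = {'boxes': {0: torch.tensor([36., 140., 17., 80.]),
                      1: torch.tensor([16., 119., 26., 64.])}}
sample_b = {'boxes': {0: torch.tensor([110., 170., 17., 64.])}}

default_collate([sample_a, sample_b])  # raises KeyError: 1 (sample_b has no key 1)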

Could you wrap boxes in a list and try it again?
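
If wrapping it does not help, another common way out is a custom collate_fn. Here is a minimal sketch (voc_collate is just an illustrative name, and it reuses voc_dataset from your snippet) that stacks the image tensors but keeps the annotations as plain Python lists, one entry per image, so default_collate never recurses into the boxes dicts:

import torch
from torch.utils.data import DataLoader

def voc_collate(batch):
    # Stack the fixed-size image tensors into a single (B, 3, H, W) tensor,
    # but keep the variable-length fields as lists with one entry per image.
    return {'name': [s['name'] for s in batch],
            'image': torch.stack([s['image'] for s in batch], dim=0),
            'orig_size': [s['orig_size'] for s in batch],
            'boxes': [s['boxes'] for s in batch]}

dataloader = DataLoader(voc_dataset, batch_size=4, shuffle=True,
                        num_workers=0, collate_fn=voc_collate)

Downstream code then loops over the list of per-image box dicts instead of indexing a collated dict.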

I see. I couldn’t find a way of wrapping it that worked (I tried a list and saw similar behavior). After thinking more about the structure, I decided to just put the boxes into a tensor, and that eventually worked well.
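
One way such a tensor encoding can look (just a sketch: MAX_BOXES and CLASS_TO_IDX are placeholders, and it assumes each boxes entry has the [class_name, [x, y], [w, h]] layout visible in the printouts above) is to pad every sample to a fixed-size (MAX_BOXES, 5) tensor of [class_idx, x, y, w, h]:

import torch

MAX_BOXES = 50  # placeholder cap on objects per image
CLASS_TO_IDX = {'chair': 0, 'sofa': 1, 'person': 2}  # ...one entry per VOC class

def boxes_to_tensor(boxes):
    # Encode {index: [class_name, [x, y], [w, h]]} as a (MAX_BOXES, 5) tensor
    # of [class_idx, x, y, w, h]; unused rows stay at -1 so every sample
    # ends up with the same shape.
    out = torch.full((MAX_BOXES, 5), -1.0)
    for i, (cls, (x, y), (w, h)) in enumerate(boxes.values()):
        out[i] = torch.tensor([CLASS_TO_IDX[cls], x, y, w, h], dtype=torch.float)
    return out

With sample['boxes'] = boxes_to_tensor(boxes_trans) in __getitem__, every sample has the same shape and the stock DataLoader batches it without a custom collate_fn.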

Thanks.