DataLoader gives "stack expects each tensor to be equal size" because different images have different numbers of objects

Hello, I am preparing a dataset for a detection task:

def __getitem__(self, idx):
   return from_numpy(image), from_numpy(bboxes)

As you can see, the second element of the return value is bboxes. Different images have different numbers of objects in them, so the shape of bboxes varies between samples, and this causes the DataLoader to throw an exception like:

RuntimeError: stack expects each tensor to be equal size, but got [1, 1, 5] at entry 0 and [1, 5, 5] at entry 1

when it tries to stack the bboxes (labels) into a batch. My question is:

What is the general paradigm for handling this issue?
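For reference, the failure can be reproduced with a toy dataset (all names here are illustrative), since the default collate function calls torch.stack on the bbox tensors:

```python
import torch
from torch.utils.data import Dataset, DataLoader

class ToyDetectionDataset(Dataset):
    """Two samples with the same image size but different box counts."""
    def __init__(self):
        self.samples = [
            (torch.zeros(3, 32, 32), torch.zeros(1, 1, 5)),  # 1 object
            (torch.zeros(3, 32, 32), torch.zeros(1, 5, 5)),  # 5 objects
        ]

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        return self.samples[idx]

loader = DataLoader(ToyDetectionDataset(), batch_size=2)
try:
    images, bboxes = next(iter(loader))
except RuntimeError as err:
    print(err)  # stack expects each tensor to be equal size, ...
```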


This discussion may be helpful.

Usually, in such situations some sort of padding has to be introduced for each batch so all elements match in size. This can be achieved by defining your own collate_fn that is then passed to DataLoader as an argument.
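For example, here is a minimal sketch of such a padding collate_fn, assuming each sample is an (image, bboxes) pair where bboxes has shape (num_objects, 5):

```python
import torch
from torch.nn.utils.rnn import pad_sequence

def pad_collate_fn(batch):
    # batch is a list of (image, bboxes) pairs; the bboxes shapes differ
    images, bboxes = zip(*batch)
    images = torch.stack(images)  # images share one size, so stacking works
    # Pad every bboxes tensor to the largest num_objects in this batch;
    # padded rows are filled with -1 so they can be masked out later
    bboxes = pad_sequence(bboxes, batch_first=True, padding_value=-1.0)
    return images, bboxes

# loader = DataLoader(my_dataset, batch_size=8, collate_fn=pad_collate_fn)
```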

Thanks, I may not have expressed my question clearly. The size (width/height) of every transformed image is the same; the problem is that the numbers of objects in the images within the same batch differ, not the heights or widths of the images.


The simplest solution to that is to define the collate_fn like this:

def collate_fn(batch):
    return tuple(zip(*batch))

And this is the for loop:

for image_batch, bbox_batch in dataloader: 

Then, inside the loop you can stack the images before passing them to the model, etc.
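Put together, a minimal sketch (the toy samples here are placeholders for a real dataset):

```python
import torch
from torch.utils.data import DataLoader

def collate_fn(batch):
    # Transpose the batch: list of (image, bboxes) pairs ->
    # (tuple of images, tuple of bboxes), with no stacking
    return tuple(zip(*batch))

# A plain list works as a map-style dataset
samples = [(torch.zeros(3, 32, 32), torch.zeros(1, 5)),
           (torch.zeros(3, 32, 32), torch.zeros(5, 5))]
dataloader = DataLoader(samples, batch_size=2, collate_fn=collate_fn)

for image_batch, bbox_batch in dataloader:
    # Images share a size, so they can be stacked inside the loop;
    # bbox_batch stays a tuple of variable-sized tensors
    images = torch.stack(image_batch)
    print(images.shape)  # torch.Size([2, 3, 32, 32])
```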

If you need some additional functionality, you can even define your own object for handling batches (in the docs there is an example of how to do this).
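A sketch of such a batch object, modeled loosely on the custom-batch example in the DataLoader docs (the class name and layout here are illustrative):

```python
import torch

class DetectionBatch:
    def __init__(self, batch):
        # batch is a list of (image, bboxes) pairs
        images, bboxes = zip(*batch)
        self.images = torch.stack(images)  # stacked eagerly
        self.bboxes = list(bboxes)         # kept variable-sized

    def pin_memory(self):
        # Lets DataLoader(..., pin_memory=True) pin this custom batch
        self.images = self.images.pin_memory()
        self.bboxes = [b.pin_memory() for b in self.bboxes]
        return self

def collate_wrapper(batch):
    return DetectionBatch(batch)

# loader = DataLoader(my_dataset, batch_size=8, collate_fn=collate_wrapper)
```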


I don’t get it. I have the same problem as OP. I am loading data like this:

my_loader = DataLoader(my_dataset, batch_size=128, shuffle=True)

How should I use that collate_fn function (what is the “batch” argument?)? What should I do in the for loop?

By default, DataLoader tries to stack the tensors to form a batch (it calls torch.stack on the current batch), but this fails if the tensors are not of equal size. With collate_fn it is possible to override this behavior and define your own “stacking procedure”. In the example above, the batch arg contains a list of instances (image-bbox pairs; the batch arg is of type List[Tuple[Image, Bbox]]), and with tuple(zip(*batch)) we form a batch where batch[0] corresponds to the images and batch[1] to the bboxes in the batch.
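Concretely, calling the collate function by hand on a toy batch shows the transposition (the tensors are placeholders):

```python
import torch

def collate_fn(batch):
    return tuple(zip(*batch))

# The batch arg as DataLoader would pass it: List[Tuple[Image, Bbox]]
batch = [(torch.zeros(3, 4, 4), torch.zeros(1, 5)),
         (torch.zeros(3, 4, 4), torch.zeros(5, 5))]

images, bboxes = collate_fn(batch)
print(len(images), len(bboxes))          # 2 2
print(bboxes[0].shape, bboxes[1].shape)  # torch.Size([1, 5]) torch.Size([5, 5])
```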

Ok, so I have a function:

def collate_fn(data):
    img, bbox = data
    zipped = zip(img, bbox)
    return zipped

Where data is an object of a class based on Dataset, like:

    def __getitem__(self, idx):
        img = self.imgs[idx]
        bbox = self.bboxs[idx]
        return (img, bbox)

And how am I supposed to use it?
When I do it like this:

my_loader = DataLoader(data, batch_size=8, shuffle=True, collate_fn=collate_fn(data))

there is an error saying: ‘zip’ object is not callable

What am I doing wrong?

First, fix the collate function itself: data is a list of (img, bbox) pairs, so unpack it with zip(*data), and consume the iterator returned by zip before returning:

    def collate_fn(data):
        # data is a list of (img, bbox) pairs
        imgs, bboxes = zip(*data)
        return list(imgs), list(bboxes)

The error is self-explanatory: collate_fn has to be a callable, or in layman’s terms a function, so pass the function itself rather than its return value:

my_loader = DataLoader(data, batch_size=8, shuffle=True, collate_fn=collate_fn)