As you can see, the second element of the return value is bboxes. Different images contain different numbers of objects, so the shapes of the bboxes tensors vary from sample to sample, which causes the DataLoader to throw an exception like:
RuntimeError: stack expects each tensor to be equal size, but got [1, 1, 5] at entry 0 and [1, 5, 5] at entry 1
when it tries to stack the bboxes (labels) into a batch. The question is: how should I handle this?
Usually, in such situations some sort of padding has to be introduced for each batch so all elements match in size. This can be achieved by defining your own collate_fn that is then passed to DataLoader as an argument.
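For example, a minimal sketch of a padding-based collate_fn (assuming each dataset item is an (image, bboxes) pair, the images already share one size, and bboxes is a float tensor of shape [num_objects, 5]; the function name and the fixed column count of 5 are assumptions taken from the error message above):

```python
import torch
from torch.utils.data import DataLoader

def pad_collate_fn(batch):
    # batch is a list of (image, bboxes) pairs; images share the same size,
    # but each bboxes tensor has shape [num_objects, 5] with varying num_objects.
    images = torch.stack([image for image, _ in batch])
    max_objects = max(bboxes.shape[0] for _, bboxes in batch)
    padded = torch.zeros(len(batch), max_objects, 5)
    for i, (_, bboxes) in enumerate(batch):
        padded[i, :bboxes.shape[0]] = bboxes  # remaining rows stay zero-padded
    return images, padded

# loader = DataLoader(dataset, batch_size=4, collate_fn=pad_collate_fn)
```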
Thanks, I may not have expressed my question clearly. The size (width/height) of every transformed image is the same; the problem is that the number of objects in each image within the same batch differs, not the height or width of the images.
By default, the DataLoader tries to stack the tensors to form a batch (it calls torch.stack on the current batch), but this fails if the tensors are not of equal size. With collate_fn it is possible to override this behavior and define your own “stacking procedure”. In the collate_fn sketched below, the batch argument contains a list of instances (image-bbox pairs, i.e. batch is of type List[Tuple[Image, Bbox]]), and with tuple(zip(*batch)) we form a batch where batch[0] corresponds to the images and batch[1] to the bboxes in the batch.
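A minimal sketch of that collate_fn, assuming each dataset item is an (image, bboxes) tuple (the dataset and batch size are placeholders):

```python
import torch
from torch.utils.data import DataLoader

def collate_fn(batch):
    # batch: List[Tuple[Image, Bbox]]; zip regroups it so that
    # element 0 is a tuple of all images and element 1 a tuple of all bboxes.
    return tuple(zip(*batch))

# loader = DataLoader(dataset, batch_size=4, collate_fn=collate_fn)
# images, bboxes = next(iter(loader))
# images = torch.stack(images)  # images share one size, so they can still be stacked
# bboxes stays a tuple of per-image tensors with varying numbers of objects
```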