DataLoader gives "stack expects each tensor to be equal size" because different images have different numbers of objects

Hello, I am preparing a dataset for a detection task:

def __getitem__(self, idx):
    ...
    return torch.from_numpy(image), torch.from_numpy(bboxes)

As you can see, the second element of the return value is bboxes. Different images have different numbers of objects in them, so the shape of bboxes varies from sample to sample, and this causes the DataLoader to throw an exception like:

RuntimeError: stack expects each tensor to be equal size, but got [1, 1, 5] at entry 0 and [1, 5, 5] at entry 1

when it tries to stack the bboxes (labels) in a batch. The question is:

What is the general paradigm for handling this issue?

This discussion may be helpful.


Usually, in such situations some sort of padding has to be introduced for each batch so all elements match in size. This can be achieved by defining your own collate_fn that is then passed to DataLoader as an argument.
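For instance, a minimal sketch of such a padding collate_fn, assuming each sample is an (image, target) pair whose target is an [N, 5] tensor (this name and the exact layout are assumptions, not taken from the question):

import torch
import torch.nn.functional as F

def pad_collate_fn(batch):
    images, targets = zip(*batch)
    max_len = max(t.shape[0] for t in targets)
    # zero-pad every target along dim 0 so they all match the longest one
    padded = [F.pad(t, (0, 0, 0, max_len - t.shape[0])) for t in targets]
    return torch.stack(images), torch.stack(padded)

The padded rows are all zeros, so they have to be masked out or otherwise ignored downstream.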

Thanks, I may not have expressed my question clearly. The size (width/height) of every transformed image is the same; the problem comes from the numbers of objects in the images of the same batch being different, not from the height or width of the images.

The simplest solution to that is to define the collate_fn like this:

def collate_fn(batch):
    # transpose a list of (image, bboxes) pairs into (images_tuple, bboxes_tuple)
    return tuple(zip(*batch))
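To use it, pass it to the DataLoader via the collate_fn argument (dataset here is just a placeholder for your dataset instance):

from torch.utils.data import DataLoader

dataloader = DataLoader(dataset, batch_size=4, shuffle=True, collate_fn=collate_fn)

With this collate_fn, each batch is a pair of tuples instead of a pair of stacked tensors.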

And this is the for loop:

for image_batch, bbox_batch in dataloader: 
    ...

Then, inside the loop you can stack the images before passing them to the model, etc.
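For example, since every transformed image has the same height and width, stacking them inside the loop is safe (a sketch; the model call at the end is a placeholder):

import torch

for image_batch, bbox_batch in dataloader:
    # all images share the same shape, so they stack into one [B, C, H, W] tensor
    images = torch.stack(image_batch)
    # bbox_batch stays a tuple of per-image [N_i, 5] tensors
    predictions = model(images, bbox_batch)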

If you need some additional functionality, you can even define your own object for handling batches (in the docs there is an example of how to do this).
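For reference, the pattern in the docs wraps the batch in a custom class so that things like memory pinning can be customized. A sketch adapted to this thread (the class and function names here are made up):

import torch

class DetectionBatch:
    def __init__(self, batch):
        images, bboxes = zip(*batch)
        # images share the same shape, so they can be stacked here
        self.images = torch.stack(images)
        # keep per-image box tensors as a list, since their lengths differ
        self.bboxes = list(bboxes)

    # called by the DataLoader on each batch when pin_memory=True
    def pin_memory(self):
        self.images = self.images.pin_memory()
        self.bboxes = [b.pin_memory() for b in self.bboxes]
        return self

def detection_collate(batch):
    return DetectionBatch(batch)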