"Sizes of tensors must match except in dimension 0." for batch size > 1. Tensors are the same shape


I have a custom dataset class. Its __getitem__() method returns a tensor with shape [3,300,300] and a target dictionary. When I initialize the dataloader with a batch size of 1 and iterate over it, I'm able to print the shapes of all tensors and have confirmed that they match.

When I increase the batch size > 1, I run into the following error when trying to iterate over the dataloader:

Original Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/collate.py", line 79, in default_collate
    return [default_collate(samples) for samples in transposed]
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/collate.py", line 79, in <listcomp>
    return [default_collate(samples) for samples in transposed]
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/collate.py", line 74, in default_collate
    return {key: default_collate([d[key] for d in batch]) for key in elem}
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/collate.py", line 74, in <dictcomp>
    return {key: default_collate([d[key] for d in batch]) for key in elem}
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/collate.py", line 64, in default_collate
    return default_collate([torch.as_tensor(b) for b in batch])
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/collate.py", line 55, in default_collate
    return torch.stack(batch, 0, out=out)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 2 and 1 in dimension 1 at /pytorch/aten/src/TH/generic/THTensor.cpp:612

I’m stumped. My tensors are the same shape, so why can’t they be stacked?

Edit - I believe it might be an issue with the target. When __getitem__ returns image, {}, this issue does not appear. Here is my __getitem__ code:

    def __getitem__(self, index):
      image_info = self.data[index]
      image = self._read_image(image_info['image_id'])
      # duplicate boxes to prevent corruption of dataset
      boxes = copy.copy(image_info['boxes'])
      # duplicate labels to prevent corruption of dataset
      labels = copy.copy(image_info['labels'])
      # Create target
      target = {}
      target['boxes'] = boxes
      target['labels'] = labels 
      target['image_id'] = image_info['image_id']
      # Perform transformations that model expects.
      if self.transforms:
        image = self.transforms(image)
      return image, target

Edit2 - It looks like the issue is that some images have two annotations where others have one and this is why stacking does not work. Below I printed the two targets that might be incompatible for stacking:

{'boxes': array([[0.288125, 0.561205, 0.600625, 0.952919],
       [0.285   , 0.464218, 0.655625, 0.702448]], dtype=float32), 'labels': array([1, 1]), 'image_id': '6c3ba7b8844e1ab5'}
{'boxes': array([[0.065   , 0.405833, 0.90375 , 1.      ]], dtype=float32), 'labels': array([1]), 'image_id': '4559641996704238'}

Any ideas how to resolve this?

Make the dataset class return that data in a form that does not get stacked, and you will get a list of elements instead.

Thanks for the reply. I’m confused. I’m not stacking, torch is calling stack so that it can handle batches.

Yep, because it inspects what you return. I think if you return lists (maybe dicts too), it looks inside them, and if there are tensors it will just try to stack them.

You can try returning dictionaries (I don't know whether it inspects those), and if that fails you can wrap everything in a dummy object.

As another option, you can write your own collate function
(it's a callable that carries out this batching; you just replace the default one).
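A minimal sketch of such a collate function, assuming each sample is an (image, target) pair as in the __getitem__ above. The name detection_collate and the stand-in data are made up for illustration; strings take the place of the image tensors so the snippet runs without torch.

```python
def detection_collate(batch):
    """Group samples without stacking the variable-length targets.

    `batch` is a list of (image, target) pairs as returned by __getitem__.
    """
    # zip(*batch) turns [(img0, tgt0), (img1, tgt1)] into (img0, img1), (tgt0, tgt1)
    images, targets = zip(*batch)
    return list(images), list(targets)


# Stand-in samples (hypothetical): one image with two boxes, one with a single box.
batch = [
    ("image_a", {"boxes": [[0.29, 0.56, 0.60, 0.95], [0.29, 0.46, 0.66, 0.70]], "labels": [1, 1]}),
    ("image_b", {"boxes": [[0.07, 0.41, 0.90, 1.00]], "labels": [1]}),
]
images, targets = detection_collate(batch)
print(len(images), [len(t["boxes"]) for t in targets])
```

To use it, pass the function to the loader, e.g. DataLoader(dataset, batch_size=4, collate_fn=detection_collate). Each batch is then a list of images and a list of target dicts, so nothing tries to stack boxes of different lengths. If all images share one shape, they could still be combined with torch.stack inside the function.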

For example, if you return strings, PyTorch cannot “stack” them, so it returns a list of strings as the batch.

I see now! I guess the collate function was the missing piece in my understanding, thank you.

This is the definition of my dataset, but I get the same error even though I resize all images to the same size, 224:

 image_transforms = {
    'train': transforms.Compose([

    'valid': transforms.Compose([

Could you pass the size argument to Resize as a sequence? An int value only resizes the smaller edge to this size, so the outputs will still differ in shape if the inputs are not square.
From the docs:

size (sequence or int) – Desired output size. If size is a sequence like (h, w), output size will be matched to this. If size is an int, smaller edge of the image will be matched to this number. i.e, if height > width, then image will be rescaled to (size * height / width, size). In torchscript mode padding as single int is not supported, use a tuple or list of length 1: [size, ] .
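To make the quoted rule concrete, here is a small sketch of the arithmetic. resized_shape is a hypothetical helper, not part of torchvision, and truncating the longer edge with int() is an assumption about the rounding.

```python
def resized_shape(height, width, size):
    """Return (h, w) after a resize that follows the rule quoted from the docs."""
    if isinstance(size, int):
        # Int: the smaller edge becomes `size`, the other edge keeps the aspect ratio.
        if height <= width:
            return size, int(size * width / height)  # assumed truncation
        return int(size * height / width), size
    # Sequence (h, w): the output matches it exactly.
    h, w = size
    return h, w


# A 300x400 input is not square, so an int size does not give 224x224.
print(resized_shape(300, 400, 224))
print(resized_shape(300, 400, (224, 224)))
```

With an int, a non-square image keeps its aspect ratio, so two inputs with different aspect ratios come out with different shapes and cannot be stacked. Passing a sequence such as (224, 224) to Resize gives every image the same shape.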
