Loading images with different numbers of bounding boxes: equal size error

I am trying to create a dataloader for my dataset. Each image contains a certain number of cars with a bounding box for each of them, and not all images have the same number of bounding boxes.

You probably won't be able to run it, but here is some info.
This is my data loader:

import os

import torch
from skimage import io
from torch.utils.data import Dataset
from torchvision import tv_tensors


class AGR_Dataset(Dataset):
    def __init__(self, annotations_root, img_root, transform=None):
        """
        Arguments:
            annotations_root (string): Path to the csv file with annotations.
            img_root (string): Directory with all the images.
            transform (callable, optional): Optional transform to be applied
                on a sample.
        """
        self.annotations_root = annotations_root
        self.img_root = img_root
        self.transform = transform

    def __len__(self):
        # Number of samples = number of image files in the image directory
        return len(os.listdir(self.img_root))
    
    def __getitem__(self, idx):
        # idx is an integer index into the directory listing of img_root
        if torch.is_tensor(idx):
            idx = idx.tolist()
        
        idx_name = os.listdir(self.img_root)[idx]
        # print(idx_name)
        
        img_name = os.path.join(self.img_root, idx_name)
        annotation_data = os.path.join(self.annotations_root, f"{idx_name.removesuffix('.jpg')}.txt")
        # print(img_name, annotation_data)

        image = io.imread(img_name)

        with open(annotation_data, 'r') as file:
            lines = file.readlines()
            img_data = []
            img_labels = []
            for line in lines:
                # Each annotation line is "<class> <cx> <cy> <w> <h>"
                line = line.split(',')
                line = [i.strip() for i in line]
                line = [float(num) for num in line[0].split()]
                img_labels.append(int(line[0]))
                img_data.append(line[1:])

        boxes = tv_tensors.BoundingBoxes(img_data, format='CXCYWH', canvas_size=(image.shape[0], image.shape[1]))

        # sample = {'image': image, 'bbox': boxes, 'labels': img_labels}
        sample = {'image': image, 'bbox': boxes}

        if self.transform:
            sample = self.transform(sample)

        print(sample['image'].shape)
        print(sample['bbox'].shape)
        # print(sample['labels'].shape)
        return sample

I define my transforms and create the dataloader:

from torch.utils.data import DataLoader
from torchvision.transforms import v2

data_transform = v2.Compose([
    v2.ToImage(),
    # v2.Resize(680),
    v2.RandomResizedCrop(size=(680, 680), antialias=True),
    # v2.ToDtype(torch.float32, scale=True),
    v2.ToTensor()
])

transformed_dataset = AGR_Dataset(f'{annotations_path}/test/', 
                        f'{img_path}/test/',
                        transform=data_transform)

dataloader = DataLoader(transformed_dataset, batch_size=2,
                        shuffle=False, num_workers=0)

Then I try to iterate through it with this:

for i, sample in enumerate(dataloader):
    print(i, sample)
    print(i, sample['image'].size(), sample['bbox'].size())

    if i == 4:
        break

With a batch size of 1 it runs properly; with a batch size of 2 I get this error:

torch.Size([3, 680, 680])
torch.Size([12, 4])

torch.Size([3, 680, 680])
torch.Size([259, 4])

RuntimeError: stack expects each tensor to be equal size, but got [12, 4] at entry 0 and [259, 4] at entry 1

I believe this is due to the number of bounding boxes not being equal across images, but how do I overcome it?
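As far as I can tell, the default collate_fn tries to torch.stack the per-sample 'bbox' tensors into one batch tensor, and that is exactly the call that fails when the box counts differ. A minimal reproduction of that failure:

import torch

# Two bounding-box tensors with different numbers of boxes, like entries 0 and 1 above
a = torch.zeros(12, 4)
b = torch.zeros(259, 4)
torch.stack([a, b])  # RuntimeError: stack expects each tensor to be equal size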

This is my first post here; let me know if I need to provide any more data or explanation.

Everyone, the solution was found; it is linked here: https://discuss.pytorch.org/t/dataloader-collate-fn-throws-runtimeerror-stack-expects-each-tensor-to-be-equal-size-in-response-to-variable-number-of-bounding-boxes/117952
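In short, the idea is to pass a custom collate_fn to the DataLoader so that the images (which all have the same shape after the crop) are still stacked, while the variable-length bounding boxes are kept as a list instead of being stacked into one tensor. A minimal sketch of that idea, assuming the sample dict format returned by AGR_Dataset above (the helper name bbox_collate_fn is just illustrative):

import torch
from torch.utils.data import DataLoader

def bbox_collate_fn(batch):
    # Images all have the same shape after RandomResizedCrop, so they can be stacked
    images = torch.stack([sample['image'] for sample in batch])
    # The number of boxes varies per image, so keep the boxes as a list of tensors
    boxes = [sample['bbox'] for sample in batch]
    return {'image': images, 'bbox': boxes}

dataloader = DataLoader(transformed_dataset, batch_size=2,
                        shuffle=False, num_workers=0,
                        collate_fn=bbox_collate_fn)

The training loop then reads the boxes for each image out of the list rather than indexing a single batched tensor.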


Oh, that's great to hear that you got your solution.