Implementing a Keypoint Regression Dataset: Addressing Coordinate Format Errors

SimCan · April 14, 2024, 2:29pm

I would like to implement a dataset for a keypoint regression task. The images are single-channel (grayscale), and the keypoint coordinates are currently stored as a list of lists: [[x0, y0], [x1, y1]]. This is my implementation of the dataset:

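import torch
from PIL import Image
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms

class PoseDataset(Dataset):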
    def __init__(self, image_paths, annotations, transform=None):
        self.image_paths = image_paths
        self.annotations = annotations
        self.transform = transform

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        image_path = self.image_paths[idx]
        image = Image.open(image_path)

        k_x0 = self.annotations[idx][0]
        k_y0 = self.annotations[idx][1]
        k_x1 = self.annotations[idx][2]
        k_y1 = self.annotations[idx][3]
        
        keypoints = torch.tensor([[k_x0, k_y0], [k_x1, k_y1]]).float()

        if self.transform:
            image, keypoints = self.transform(image, keypoints)

        return image, keypoints

transform = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomVerticalFlip(),
    transforms.ToTensor(),
])

dataset = PoseDataset(image_paths, annotations, transform=transform)
dataloader = DataLoader(dataset, batch_size=BS, shuffle=True)

next(iter(dataloader))

Running this code produces the following error:

Cell In[10], line 31, in PoseDataset.__getitem__(self, idx)
     28 keypoints = [[k_x0, k_y0], [k_x1, k_y1]]
     30 if self.transform:
---> 31     image, keypoints = self.transform(image, keypoints)
     33 return image, keypoints

TypeError: Compose.__call__() takes 2 positional arguments but 3 were given

I would like to know why I am getting this error.

Is it perhaps due to the format of the coordinates?
If so, in what format should they be?
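
For reference, the traceback points at the call itself rather than at the coordinate format: torchvision's `Compose.__call__` accepts a single input, so `self.transform(image, keypoints)` passes one argument too many (`self` plus two). The classic `transforms` pipeline also has no knowledge of keypoints, so a random flip applied to the image alone would leave the coordinates pointing at the wrong pixels. Below is a minimal sketch of one possible workaround, not a definitive fix. The class names `PairedCompose`, `PairedRandomHorizontalFlip`, and `PairedRandomVerticalFlip` are hypothetical helpers, not part of torchvision. The sketch assumes the image is already a `(C, H, W)` tensor and the keypoints are an `(N, 2)` tensor of `[x, y]` pixel indices, so that mirroring uses `size - 1 - coordinate`.

```python
import random

import torch


class PairedCompose:
    """Hypothetical helper: chain transforms that take and return (image, keypoints)."""

    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, image, keypoints):
        for t in self.transforms:
            image, keypoints = t(image, keypoints)
        return image, keypoints


class PairedRandomHorizontalFlip:
    """Flip a (C, H, W) image tensor and its (N, 2) [x, y] keypoints together."""

    def __init__(self, p=0.5):
        self.p = p

    def __call__(self, image, keypoints):
        if random.random() < self.p:
            width = image.shape[-1]
            image = torch.flip(image, dims=[-1])  # reverse the width axis
            keypoints = keypoints.clone()  # leave the caller's tensor untouched
            # Assumption: x is a pixel index in [0, width - 1].
            keypoints[:, 0] = (width - 1) - keypoints[:, 0]
        return image, keypoints


class PairedRandomVerticalFlip:
    """Same idea as above, applied to the height axis and the y coordinate."""

    def __init__(self, p=0.5):
        self.p = p

    def __call__(self, image, keypoints):
        if random.random() < self.p:
            height = image.shape[-2]
            image = torch.flip(image, dims=[-2])  # reverse the height axis
            keypoints = keypoints.clone()
            # Assumption: y is a pixel index in [0, height - 1].
            keypoints[:, 1] = (height - 1) - keypoints[:, 1]
        return image, keypoints


# Small check on a dummy single-channel 4x6 image with two keypoints.
image = torch.arange(24, dtype=torch.float32).reshape(1, 4, 6)
keypoints = torch.tensor([[0.0, 1.0], [5.0, 3.0]])

paired_transform = PairedCompose([
    PairedRandomHorizontalFlip(p=1.0),  # p=1.0 so both flips always fire here
    PairedRandomVerticalFlip(p=1.0),
])
flipped_image, flipped_keypoints = paired_transform(image, keypoints)
```

With something like this, `__getitem__` could first convert the PIL image to a tensor (for example with `transforms.ToTensor()(image)`) and then call the paired transform with both the image and the keypoints. The `[[x0, y0], [x1, y1]]` layout itself can stay as it is.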