Loading a custom dataset for a multi-output regression task

Firas · April 24, 2019, 5:47am

Hello everyone, I'm grateful to see such a helpful community.
I’m trying to load my custom dataset in PyTorch. The training data is arranged in a main folder “train” that contains two subfolders, “Input” and “GT”. I want to build a regression NN model in which each image in the “Input” subfolder corresponds to 10 images in the “GT” subfolder, so the model outputs 10 images. For example, the first image, …\train\input\image_1.bmp, has the targets …\train\GT\image_1_1.bmp, image_1_2.bmp, …, image_1_10.bmp. The second image, …\train\input\image_2.bmp, has …\train\GT\image_2_1.bmp, image_2_2.bmp, …, image_2_10.bmp, and so on. Any ideas on the implementation?

ptrblck · April 24, 2019, 11:14am

I would suggest writing a custom Dataset and implementing the logic to load the corresponding 10 label images in the __getitem__(self, index) method.
To enable lazy loading, pass the file paths to __init__ and load only the current sample's images in __getitem__.
Here is some pseudo code:

from PIL import Image
from torch.utils.data import Dataset


class MyDataset(Dataset):
    def __init__(self, data_paths, label_paths, transform=None, target_transform=None):
        self.data_paths = data_paths  # Could be a list: ['./train/input/image_1.bmp', './train/input/image_2.bmp', ...]
        self.label_paths = label_paths  # Could be a nested list: [['./train/GT/image_1_1.bmp', './train/GT/image_1_2.bmp', ...], ['./train/GT/image_2_1.bmp', './train/GT/image_2_2.bmp', ...]]
        self.transform = transform
        self.target_transform = target_transform
        
    def __getitem__(self, index):
        # Load only the input image for this sample (lazy loading)
        x = Image.open(self.data_paths[index])
        if self.transform:
            x = self.transform(x)

        # Load every target image listed for this input
        ys = []
        for label_path in self.label_paths[index]:
            y = Image.open(label_path)
            if self.target_transform:
                y = self.target_transform(y)
            ys.append(y)

        return x, ys

    def __len__(self):
        return len(self.data_paths)
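
To fill data_paths and label_paths from the folder layout described in the question, something like the following might work. It is only a sketch: build_path_lists, its parameters, and the default folder names "input" and "GT" are assumptions for illustration, and it assumes the files are named exactly image_<i>.bmp and image_<i>_<k>.bmp.

```python
from pathlib import Path


def build_path_lists(root, n_targets=10, ext=".bmp",
                     input_dirname="input", gt_dirname="GT"):
    """Hypothetical helper: pair each input image with its target images.

    Assumes inputs are named image_<i><ext> and targets image_<i>_<k><ext>
    for k = 1..n_targets. Adjust the folder names to match your disk.
    """
    root = Path(root)
    input_dir = root / input_dirname
    gt_dir = root / gt_dirname

    # Sort by the numeric index so image_10 comes after image_2
    inputs = sorted(
        input_dir.glob(f"image_*{ext}"),
        key=lambda p: int(p.stem.split("_")[1]),
    )

    data_paths, label_paths = [], []
    for input_path in inputs:
        targets = [
            gt_dir / f"{input_path.stem}_{k}{ext}"
            for k in range(1, n_targets + 1)
        ]
        # Fail early instead of crashing later inside __getitem__
        missing = [t.name for t in targets if not t.exists()]
        if missing:
            raise FileNotFoundError(
                f"Missing targets for {input_path.name}: {missing}"
            )
        data_paths.append(str(input_path))
        label_paths.append([str(t) for t in targets])

    return data_paths, label_paths
```

The two returned lists can then be passed straight to the dataset, e.g. MyDataset(*build_path_lists('./train')). Checking for missing targets up front keeps the index of each input aligned with its list of targets.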

Let me know if you get stuck somewhere.

Firas · April 25, 2019, 9:34am

Thanks for your reply, your code works just fine.