Hello everyone, and thanks for such a helpful community.
I’m trying to load my custom dataset in PyTorch. The training data sits in a main folder “train” that contains two subfolders, “Input” and “GT”. I want to build a regression NN model in which each image in the “Input” subfolder corresponds to 10 images in the “GT” subfolder, so the model outputs 10 images. For example, the first image, …\train\input\image_1.bmp, has the targets …\train\GT\image_1_1.bmp, image_1_2.bmp, …, image_1_10.bmp. The second image, …\train\input\image_2.bmp, has …\train\GT\image_2_1.bmp, image_2_2.bmp, …, image_2_10.bmp, and so on. Any ideas on the implementation?
I would suggest writing a custom Dataset and implementing the logic to load the corresponding 10 label images in the __getitem__(self, index) method.
To enable lazy loading, you should pass the file paths in __init__ and only load the current sample's images in __getitem__.
Here is some pseudo code:
from PIL import Image
from torch.utils.data import Dataset

class MyDataset(Dataset):
    def __init__(self, data_paths, label_paths, transform=None, target_transform=None):
        # Store only the file paths here; the images are loaded lazily in __getitem__
        self.data_paths = data_paths  # Could be a list: ['./train/input/image_1.bmp', './train/input/image_2.bmp', ...]
        self.label_paths = label_paths  # Could be a nested list: [['./train/GT/image_1_1.bmp', './train/GT/image_1_2.bmp', ...], ['./train/GT/image_2_1.bmp', './train/GT/image_2_2.bmp', ...]]
        self.transform = transform
        self.target_transform = target_transform

    def __getitem__(self, index):
        # Load the input image for this sample
        x = Image.open(self.data_paths[index])
        if self.transform:
            x = self.transform(x)
        # Load all label images that belong to this input
        ys = []
        for label_path in self.label_paths[index]:
            y = Image.open(label_path)
            if self.target_transform:
                y = self.target_transform(y)
            ys.append(y)
        return x, ys

    def __len__(self):
        return len(self.data_paths)
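The two path lists could be built from the folder layout described in the question. The following is only a minimal sketch: the helper name build_path_lists and its parameters (input_dirname, gt_dirname, num_targets, ext) are made up for illustration, and it assumes the files follow the image_<i>.bmp / image_<i>_<k>.bmp naming exactly, with the subfolders spelled "input" and "GT" on disk.

```python
from pathlib import Path

def build_path_lists(root, input_dirname="input", gt_dirname="GT",
                     num_targets=10, ext=".bmp"):
    """Pair each input image with its target images by file name (hypothetical helper)."""
    root = Path(root)
    input_dir = root / input_dirname
    gt_dir = root / gt_dirname

    # Sort by the integer after "image_" so that image_10 comes after image_9.
    inputs = sorted(input_dir.glob(f"image_*{ext}"),
                    key=lambda p: int(p.stem.split("_")[1]))

    data_paths, label_paths = [], []
    for input_path in inputs:
        # image_3.bmp -> image_3_1.bmp ... image_3_10.bmp
        targets = [gt_dir / f"{input_path.stem}_{k}{ext}"
                   for k in range(1, num_targets + 1)]
        missing = [t.name for t in targets if not t.exists()]
        if missing:
            raise FileNotFoundError(
                f"Missing targets for {input_path.name}: {missing}")
        data_paths.append(str(input_path))
        label_paths.append([str(t) for t in targets])
    return data_paths, label_paths
```

The result can then be passed straight to the dataset, e.g. data_paths, label_paths = build_path_lists('./train') followed by MyDataset(data_paths, label_paths). Failing early on a missing target file is a design choice here; skipping incomplete samples would work as well.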
Let me know if you get stuck somewhere.
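One more note for the regression setup: when __getitem__ returns the targets as a Python list of tensors, the default DataLoader collation yields a list of 10 batched tensors rather than a single tensor. If a single target tensor is more convenient for the loss, the list could be stacked before returning it. This is just a sketch with dummy tensors standing in for the transformed label images; the helper name stack_targets and the 1x4x4 image size are assumptions, and it presumes target_transform already converts each image to a tensor of the same shape.

```python
import torch

def stack_targets(ys):
    """Stack N target tensors of shape [C, H, W] into one [N, C, H, W] tensor."""
    return torch.stack(ys, dim=0)

# Dummy stand-ins for 10 transformed label images (1 channel, 4x4 pixels).
ys = [torch.full((1, 4, 4), float(k)) for k in range(10)]

target = stack_targets(ys)           # shape [10, 1, 4, 4]
target_as_channels = target.flatten(0, 1)  # shape [10, 4, 4], one channel per label image
```

In the dataset this would amount to returning x, torch.stack(ys) instead of x, ys. Whether to keep the extra dimension or merge it into the channel dimension depends on the shape the model outputs.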
Thanks for your reply, your code works just fine.