How Do I Make a Custom Dataset for Image Segmentation?

Max_Smith · July 16, 2020, 10:11pm

Hi, I’ve spent the last couple of hours researching this to no avail. I’m trying to load both my testing images and my overlay masks into PyTorch as one dataset.

My folders look like this:

All testing images-
    -Video1
        -image1.png
        -image2.png
        -image3.png
    -Video2
        -image1.png
        -image2.png
        -image3.png
    -Video3
        -image1.png
        -image2.png
        -image3.png
    -Video4
        -image1.png
        -image2.png
        -image3.png

All labels-
    -Video1
        -image1.png
        -image2.png
        -image3.png
    -Video2
        -image1.png
        -image2.png
        -image3.png
    -Video3
        -image1.png
        -image2.png
        -image3.png
    -Video4
        -image1.png
        -image2.png
        -image3.png

How can I make a dataset? I already read the official PyTorch tutorial, but that one uses a CSV file for the labels, whereas I have actual images as labels. Please help, thanks.

Alex_Dima · July 17, 2020, 12:31am

Hello! I am quite new to PyTorch and recently had to deal with something like that. I am not sure I understand your issue very well, but this is how I made my dataset:

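import glob

from PIL import Image
from torch.utils.data import Dataset

# get all image and mask paths; glob does not guarantee any order, so sort both
# lists to keep each image aligned with its mask (assumes matching file names)
image_paths = sorted(glob.glob("C:\\Users\\...\\data\\img\\*.jpg"))
mask_paths = sorted(glob.glob("C:\\Users\\...\\data\\msk\\*.jpg"))

# split paths: the first 60% go to training, the rest to testing
len_images = len(image_paths)
print(len_images)
train_size = 0.6
split_idx = int(len_images * train_size)

train_image_paths = image_paths[:split_idx]
test_image_paths = image_paths[split_idx:]

train_mask_paths = mask_paths[:split_idx]
test_mask_paths = mask_paths[split_idx:]

# custom dataset class
class image_dataset(Dataset):
    def __init__(self, images, masks, train=True):
        # images and masks are lists of file paths in the same order
        # (the train flag is accepted but not used here)
        self.images = images
        self.masks = masks

    def __getitem__(self, idx):
        # returns PIL images; convert them to tensors (e.g. with a transform)
        # before batching with a DataLoader
        image = Image.open(self.images[idx])
        mask = Image.open(self.masks[idx])
        return image, mask

    def __len__(self):
        return len(self.images)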

In my case I only have the pictures and masks in one folder each, but I suppose there is a workaround for the subfolders. I hope this will be of help and not more confusing! Good luck!
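
For the subfolder layout in the question, one possible workaround is to walk the image folder recursively and look for each mask at the same relative path under the label folder. This is only a rough sketch: the helper name pair_image_and_mask_paths and its parameters are made up for illustration, and it assumes both roots share the same VideoN/imageM.png structure.

```python
from pathlib import Path


def pair_image_and_mask_paths(image_root, mask_root, pattern="*.png"):
    """Return sorted (image_path, mask_path) pairs matched by relative path.

    Hypothetical helper: assumes the mask root mirrors the image root,
    e.g. images/Video1/image1.png <-> labels/Video1/image1.png.
    """
    image_root, mask_root = Path(image_root), Path(mask_root)
    pairs = []
    # rglob walks every subfolder (Video1, Video2, ...) under the image root
    for image_path in sorted(image_root.rglob(pattern)):
        # the mask is expected at the same relative location under the mask root
        mask_path = mask_root / image_path.relative_to(image_root)
        if not mask_path.is_file():
            raise FileNotFoundError(f"No mask found for {image_path}")
        pairs.append((str(image_path), str(mask_path)))
    return pairs
```

If the pairs list is not empty, it can be split back into two lists with image_paths, mask_paths = zip(*pairs) and passed to a dataset class that takes a list of image paths and a list of mask paths. Matching by relative path rather than by list position means a missing mask raises an error instead of silently shifting every later pair.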