(image, mask) pairs do not correspond to each other

I am writing a simple custom Dataset (which I will add more features to later) for a segmentation dataset, but the (image, mask) pair returned by the __getitem__() method does not match: the returned mask belongs to a different image than the one returned with it. My directory structure is /home/bohare/data/images and /home/bohare/data/masks.

Following is the code I have:

import torch
from torch.utils.data.dataset import Dataset
from PIL import Image
import glob
import os
import matplotlib.pyplot as plt

class CustomDataset(Dataset):
    def __init__(self, folder_path):
        
        self.img_files = glob.glob(os.path.join(folder_path,'images','*.png'))
        self.mask_files = glob.glob(os.path.join(folder_path,'masks','*.png'))
    
    def __getitem__(self, index):
        
        image = Image.open(self.img_files[index])
        mask = Image.open(self.mask_files[index])
        
        return image, mask
    
    def __len__(self):
        return len(self.img_files)

data = CustomDataset(folder_path='/home/bohare/data')
len(data)

This code correctly reports the total size of the dataset.

But when I call img, msk = data.__getitem__(n) (equivalent to data[n]), where n is the index of any (image, mask) pair, and plot the image and mask, they do not correspond to one another.

What should I change or add in the code to make sure each (image, mask) pair is returned correctly? Thanks for the help.

I found the solution thanks to the user David S. Apparently, glob.glob returns the file list in arbitrary order, so I wrapped both the image and mask calls in sorted(), like so:

self.img_files = sorted(glob.glob(os.path.join(folder_path,'images','*.png')))
self.mask_files = sorted(glob.glob(os.path.join(folder_path,'masks','*.png')))

And now it works.
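Sorting only lines the two lists up if every image has exactly one mask and the file names sort the same way in both folders. A minimal sketch of a sanity check that could be run once in __init__ is below. It assumes an image and its mask share the same file name (the original post does not say so), and check_pairs is a hypothetical helper, not part of PyTorch.

```python
import os


def check_pairs(img_files, mask_files):
    """Raise ValueError if the two lists do not line up by file name.

    Assumption: an image and its mask have identical base names,
    e.g. images/a.png and masks/a.png.
    """
    if len(img_files) != len(mask_files):
        raise ValueError(
            f"{len(img_files)} images but {len(mask_files)} masks"
        )
    for img_path, mask_path in zip(img_files, mask_files):
        if os.path.basename(img_path) != os.path.basename(mask_path):
            raise ValueError(f"mismatch: {img_path} vs {mask_path}")


# Hypothetical file names, for illustration only.
imgs = sorted(["images/b.png", "images/a.png"])
masks = sorted(["masks/a.png", "masks/b.png"])
check_pairs(imgs, masks)  # passes silently when the lists correspond
```

Calling check_pairs(self.img_files, self.mask_files) right after the two sorted(...) lines would make a misaligned dataset fail loudly at construction time instead of silently returning the wrong mask.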

I’m running into a similar issue, except I still see images and masks that don’t correspond.

For example:

image_path:  /Brain_Tumor_2D_Dataset/Meningioma/images/Meningioma_159.tif
mask_path:  /Brain_Tumor_2D_Dataset/Meningioma/masks/Meningioma_160_mask.tif

instead of a matching pair such as:

/Brain_Tumor_2D_Dataset/Glioblastoma/images/Glioblastoma_t2_010.tif
/Brain_Tumor_2D_Dataset/Glioblastoma/masks/Glioblastoma_t2_010_mask.tif
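One possible cause, offered as a guess since the full file listing is not shown: when mask names carry a suffix such as _mask, sorted() can order the two folders differently, because '.' sorts before digits while '_' sorts after them. A missing or extra file would shift the lists in the same way. The sketch below shows the ordering effect on hypothetical file names and then derives each mask path from its image path instead of relying on list position. mask_path_for is a hypothetical helper, and the sibling images/masks layout and the _mask suffix are taken from the paths above.

```python
import os

# Hypothetical file names showing how the two sorted lists can diverge.
images = sorted(["Meningioma_16.tif", "Meningioma_160.tif"])
masks = sorted(["Meningioma_16_mask.tif", "Meningioma_160_mask.tif"])
# images keeps 16 before 160, but masks puts 160_mask before 16_mask,
# so index 0 of one list no longer matches index 0 of the other.


def mask_path_for(image_path, mask_suffix="_mask"):
    """Build the mask path that belongs to image_path.

    Assumptions: masks live in a 'masks' folder next to the image's
    folder, and the mask name is the image stem plus mask_suffix.
    """
    images_dir, file_name = os.path.split(image_path)
    stem, ext = os.path.splitext(file_name)
    masks_dir = os.path.join(os.path.dirname(images_dir), "masks")
    return os.path.join(masks_dir, stem + mask_suffix + ext)


image_path = "/Brain_Tumor_2D_Dataset/Meningioma/images/Meningioma_159.tif"
mask_path = mask_path_for(image_path)
```

In a Dataset, __init__ would then glob only the images and __getitem__ would open mask_path_for(self.img_files[index]), so a missing mask raises a file-not-found error rather than returning another image's mask.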