Load images from two folders

I am trying to load my training image data, which is stored in two folders named “data/train1” and “data/train2”. How could I do that? I need some coding help. Thanks.

Creating a custom Dataset as described here would be the best approach.

I have reviewed this link a couple of times. It’s fairly confusing.

Could you describe your confusion a bit more and where you are stuck?

I have a dataframe that stores “image_path” and “image_class”. Images are stored in my local folders in .JPG format. I tried the exact same code (with customizations) provided in the example. It didn’t work.

The error message says I need to use “transforms.ToPILImage()” to convert it to a PIL image (TypeError: pic should be PIL Image or ndarray. Got <class ‘dict’>). After I converted the image to PIL format, another error was raised: “TypeError: pic should be Tensor or ndarray. Got <class ‘dict’>”. That made me feel this tutorial is not useful and a little bit misleading. Why is it using skimage io to open images?

It was an “easy” request: read the file path and image class from a dataframe into a Dataset, and then pass that Dataset to a DataLoader. Following that tutorial didn’t help.

The tutorial explains the Dataset class and the required steps you would need to implement to write a custom Dataset:

  • load data, paths, set transformations etc. in its __init__ method
  • load and process a single sample in the __getitem__ using the passed index
  • return the length of the dataset (number of samples) in its __len__ method.
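
As a rough sketch of those three steps, here is a minimal Dataset over in-memory samples. The class name `ListDataset` and the toy tensors are made up for illustration and are not taken from the tutorial; a real image Dataset would load a file in `__getitem__` instead.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ListDataset(Dataset):
    """Hypothetical example: wraps in-memory samples and labels."""

    def __init__(self, samples, labels, transform=None):
        # Step 1: store the data (or file paths) and the transformation.
        if len(samples) != len(labels):
            raise ValueError("samples and labels must have the same length")
        self.samples = samples
        self.labels = labels
        self.transform = transform

    def __getitem__(self, index):
        # Step 2: load and process a single sample for the passed index.
        x = self.samples[index]
        if self.transform is not None:
            x = self.transform(x)
        return x, self.labels[index]

    def __len__(self):
        # Step 3: return the number of samples.
        return len(self.samples)


# Toy data: four samples with three features each.
dataset = ListDataset(
    samples=[torch.zeros(3), torch.ones(3), torch.full((3,), 2.0), torch.full((3,), 3.0)],
    labels=[0, 1, 0, 1],
)
loader = DataLoader(dataset, batch_size=2, shuffle=False)
first_x, first_y = next(iter(loader))  # one batch of samples and labels
```

The DataLoader only relies on `__getitem__` and `__len__`, so whichever library reads the sample inside `__getitem__` is up to you.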

You don’t need to follow the tutorial step by step and can of course pick any library you like to load the data samples. If you don’t like skimage, use PIL. If PIL doesn’t work, use OpenCV etc.

Such a nice and clear answer. I got it to work (in less than 2 minutes). That tutorial is confusing. It’s been a few days of frustration.

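from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

# NOTE: class_to_idx is assumed to be a dict defined elsewhere that maps
# each class name to an integer index; it must exist before my_data is created.

class my_data(Dataset):
    def __init__(self, df, transform=None):
        self.img_path = df.iloc[:, 2].tolist()   # image file path in column index 2 of the dataframe
        self.img_label = df.iloc[:, 3].tolist()  # image label in column index 3 of the dataframe
        self.class2index = class_to_idx

        if transform is None:
            # default: resize to 224x224 and convert to a tensor
            transform = transforms.Compose([
                transforms.Resize((224, 224)),
                transforms.ToTensor(),
            ])
        self.transform = transform

    def __len__(self):
        return len(self.img_label)

    def __getitem__(self, index):
        filename = self.img_path[index]
        label = self.class2index[self.img_label[index]]
        # convert to RGB so grayscale or CMYK files also yield 3 channels
        image = Image.open(filename).convert("RGB")
        image = self.transform(image)

        return image, label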
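
For the original question about the two folders, one possible way to gather the file paths is sketched below using only the standard library. The helper name `collect_image_paths` and the extension list are my own assumptions, not part of the code above; the resulting list could be used to fill the image path column of the dataframe.

```python
from pathlib import Path


def collect_image_paths(folders, extensions=(".jpg", ".jpeg", ".png")):
    """Return image file paths found directly inside each folder, in folder order."""
    paths = []
    for folder in folders:
        # sorted() keeps the order of files within a folder reproducible
        for p in sorted(Path(folder).iterdir()):
            # compare the suffix case-insensitively so ".JPG" also matches
            if p.is_file() and p.suffix.lower() in extensions:
                paths.append(str(p))
    return paths


# Example call (assumes both folders exist):
# paths = collect_image_paths(["data/train1", "data/train2"])
```

Subfolders are not searched in this sketch, and the labels still have to come from wherever they are recorded, for example the dataframe.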