Using ImageFolder without subfolders/labels

I am working on an image classification project right now. My training dataset consists of 102165 PNG files of different instruments.
[Screenshot: 微信图片_20200122233631]
I have only a train folder, which contains all the image files, as the screenshot above shows. There are 10 classes in total, and every file name starts with xxxxx_acoustics_xxxxxxxxxxxx. I have already written a script that extracts the names and manages to get all 10 class labels.
My question is: is there a way to use ImageFolder to load this data? I know that ImageFolder requires sub-folders, but since I have so many images, I think moving them into different sub-folders would be very time-consuming.

Can anyone point out what I can try? Thanks in advance!


Since you already have a method to extract the labels, I would suggest writing a custom Dataset and loading each sample there.
Something like this could be a starting point:

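import os

from PIL import Image
from torch.utils.data import Dataset


class MyDataset(Dataset):
    def __init__(self, image_paths, transform=None):
        # image_paths: list of paths to the image files in the train folder
        self.image_paths = image_paths
        self.transform = transform

    def get_class_label(self, image_name):
        # your method here: map the file name to its class label
        y = ...
        return y

    def __getitem__(self, index):
        image_path = self.image_paths[index]
        x = Image.open(image_path)
        # Pass only the file name (without directories) to the label method
        y = self.get_class_label(os.path.basename(image_path))
        if self.transform is not None:
            x = self.transform(x)
        return x, y

    def __len__(self):
        return len(self.image_paths)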
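
If your extraction script returns class names as strings, you will probably want to turn them into integer indices before returning them from get_class_label, since loss functions such as nn.CrossEntropyLoss expect integer class indices as targets. The following is only a rough sketch of one way to do that. The helper names extract_class_name and build_class_to_idx, the rule "the class name is the part before the first underscore", and the file names are all made up for illustration; swap in your own extraction logic.

```python
import os


def extract_class_name(image_name):
    # Hypothetical rule: the class name is everything before the first
    # underscore in the file name. Replace this with your own script.
    stem = os.path.splitext(os.path.basename(image_name))[0]
    return stem.split("_")[0]


def build_class_to_idx(image_paths):
    # Sort the unique class names so the mapping is stable across runs
    classes = sorted({extract_class_name(p) for p in image_paths})
    return {name: idx for idx, name in enumerate(classes)}


# Made-up file names, for illustration only
image_paths = [
    "train/guitar_acoustic_001.png",
    "train/flute_acoustic_002.png",
    "train/guitar_acoustic_003.png",
]

class_to_idx = build_class_to_idx(image_paths)
labels = [class_to_idx[extract_class_name(p)] for p in image_paths]
print(class_to_idx)
print(labels)
```

With a mapping like this built once (for example in __init__), get_class_label could simply return class_to_idx[extract_class_name(image_name)]. Sorting the class names is similar to what ImageFolder does with its sub-folder names.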