Loading data for multilabel classification, too large for memory

I have one folder containing ~30k images. There are 19 classes, and an image can belong to n of them (multilabel classification), so I can't use the ImageFolder function. I also need the name of each image file, since the filename contains the ID of the image and there is a corresponding DataFrame with the classes of each ID. What is the most efficient way to load this data?

You can solve this by implementing your own version of torch.utils.data.Dataset. A detailed tutorial here 🙂
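
For illustration, here is a minimal sketch of such a custom Dataset. It assumes the DataFrame is indexed by image ID with one 0/1 column per class, and that the image ID is simply the filename without its extension — adjust those parts to your actual data. Images are opened lazily in `__getitem__`, so the whole set never has to fit into memory.

```python
import os

import torch
from PIL import Image
from torch.utils.data import Dataset


class MultiLabelImageDataset(Dataset):
    """Loads images lazily from disk and looks up multilabel targets in a DataFrame."""

    def __init__(self, image_dir, labels_df, transform=None):
        # Assumption: labels_df is indexed by image ID and has one column
        # per class (19 columns of 0/1 values).
        self.image_dir = image_dir
        self.labels_df = labels_df
        self.transform = transform
        self.filenames = sorted(os.listdir(image_dir))

    def __len__(self):
        return len(self.filenames)

    def __getitem__(self, idx):
        filename = self.filenames[idx]
        # Assumption: the image ID is the filename without its extension.
        image_id = os.path.splitext(filename)[0]

        image = Image.open(os.path.join(self.image_dir, filename)).convert("RGB")
        if self.transform is not None:
            image = self.transform(image)

        # Multi-hot target vector; float dtype works with BCEWithLogitsLoss.
        target = torch.tensor(self.labels_df.loc[image_id].values, dtype=torch.float32)
        return image, target
```

Usage would then look something like this (the CSV path, column name, and image folder are placeholders):

```python
import pandas as pd
from torch.utils.data import DataLoader
from torchvision import transforms

labels_df = pd.read_csv("labels.csv", index_col="id")  # hypothetical label file
transform = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])

dataset = MultiLabelImageDataset("path/to/images", labels_df, transform=transform)
loader = DataLoader(dataset, batch_size=32, shuffle=True, num_workers=4)
```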