Data Loader for multiple classes with train / test subfolders

I am struggling with PyTorch. My folders are organized as follows:

imgs
|
|_classA
  |_train
     |_img1.png
     |_img2.png
  |_test
     |_img20.png
     |_img21.png
|
|
|_classB
  |_train
  |_test
...

For each class (!), I would like to load the data from train, train some network, and evaluate it on test.

RTFM got me this far:

  1. ImageLoader for loading my .png files
  2. Custom Dataset, e.g. to keep track of which class each image is from
  3. DataLoader to get batches of my data and hand them over to my model

Now most of the tutorials have the structure

train
|
|
test

and the classes are within the train / test folders.

I am not sure how to get started, since train / test are sub-folders of each class in my structure.

As you have already explained, using the native ImageFolder dataset won’t work, since your folder structure is different. You could of course rearrange it (via symlinks if you don’t want to move the actual data), but in case that’s not possible, you could write a custom Dataset. As a starting point, check torchvision's make_dataset method, which traverses the folders to collect the sample paths and the corresponding targets.
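
To make the custom Dataset idea concrete, here is a minimal sketch that walks the imgs/<class>/<split> layout from the question. The names collect_samples, SplitFolder, IMG_EXTENSIONS and the loader argument are made up for illustration and are not part of torchvision. The class only implements __len__ and __getitem__, which is the protocol a map-style dataset needs; in a real project you would probably subclass torch.utils.data.Dataset and pass something like PIL.Image.open as the loader.

```python
from pathlib import Path

# Assumption: which file suffixes count as images (hypothetical constant).
IMG_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def collect_samples(root, split):
    """Collect (path, class_index) pairs from root/<class>/<split>/*."""
    root = Path(root)
    # Every directory directly under root is treated as one class.
    classes = sorted(d.name for d in root.iterdir() if d.is_dir())
    class_to_idx = {name: i for i, name in enumerate(classes)}
    samples = []
    for name in classes:
        split_dir = root / name / split
        if not split_dir.is_dir():
            continue  # this class has no such split folder
        for path in sorted(split_dir.iterdir()):
            if path.suffix.lower() in IMG_EXTENSIONS:
                samples.append((path, class_to_idx[name]))
    return samples, class_to_idx


class SplitFolder:
    """Map-style dataset over one split ("train" or "test") of all classes."""

    def __init__(self, root, split, loader, transform=None):
        self.samples, self.class_to_idx = collect_samples(root, split)
        self.loader = loader        # callable: path -> image
        self.transform = transform  # optional callable applied to the image

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        path, target = self.samples[index]
        image = self.loader(path)
        if self.transform is not None:
            image = self.transform(image)
        return image, target
```

With this, one SplitFolder built with split "train" and another with split "test" can each be handed to a DataLoader, provided the transform turns every image into a tensor of the same shape. If you really want a separate model per class, you could filter samples by target before training.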