Iterating through ImageFolder for sample, target

John_J_Watson · May 21, 2020, 8:54am

I am trying to use the ImageFolder class to read a bunch of images which are arranged this way:

0(folder)
    image01
    image01

1(folder)
    image01
    image01

2(folder)
    image01
    image01

Then I do something like:

ds = torchvision.datasets.ImageFolder(root=path_to, transform=p)

I can see the list of classes through:

list_of_classes = list(map(int, ds.classes))

Now, following the docs, I iterate through the samples and targets like so:

for idx, (sample, target) in enumerate(ds):
    print(sample, target)

For some reason, I notice that the target is showing the indices of ds.classes rather than the target value itself.

Is this the intended behaviour? I was expecting it to simply show me the classes (0,1,2) in this case.

ptrblck · May 21, 2020, 9:49am

Assuming dataset.classes returns the folder names, the corresponding indices should be the class indices (i.e. the target values), shouldn’t it?
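
A minimal pure-Python sketch of how the target indices relate to the folder names, assuming (as torchvision's ImageFolder does) that the class names are the folder names sorted as strings. The folder names below are made up for illustration; on a real dataset the same mapping is exposed as ds.class_to_idx.

```python
# Hypothetical folder names, standing in for the sub-folders under `root`.
folder_names = ["2", "0", "10", "1"]

# Class names are the folder names in sorted order (string sort, not numeric).
classes = sorted(folder_names)

# Each class name maps to its position in the sorted list; that position
# is the target value the dataset yields for images in that folder.
class_to_idx = {name: idx for idx, name in enumerate(classes)}

print(classes)
print(class_to_idx)
```

Because the sort is on strings, the folder "10" comes before "2" here, so a folder name and its target index only coincide in simple cases such as folders named 0, 1, 2.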

John_J_Watson · May 21, 2020, 10:17am

@ptrblck: thank you for your reply, but I am not sure I understand what you have said.

So, dataset.classes does return the folder names, but when I loop through ds, the target shows me the indices into dataset.classes rather than the folder names.

To the NN I would pass the sample and the target (which here is the folder name), not the indices, right?

I am confused as to why this is… I obviously do not understand something

So, when I do:

for idx, (sample, target) in enumerate(train_dataset):
    print(sample, list_of_classes[target])

it seems to pick up the correct labels (aka targets/folder names). Is this the correct way to do it?
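
A small sketch of this lookup using stand-in values rather than a real dataset. The names classes and targets are hypothetical placeholders for ds.classes and for the targets the loop yields, and the folder names are chosen so that they differ from the indices.

```python
# Stand-in for ds.classes: folder names, already in sorted order.
classes = ["3", "7", "9"]

# Same conversion as in the post: folder names as integers.
list_of_classes = list(map(int, classes))

# Stand-in for the targets yielded while iterating over the dataset.
targets = [0, 0, 1, 2]

# Indexing the class list with a target recovers the folder name.
labels = [list_of_classes[t] for t in targets]
print(labels)
```

The lookup is useful for printing or reporting; the targets themselves stay as indices.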

ptrblck · May 21, 2020, 10:19am

That is expected: the target should contain the class indices, not the names.
So, e.g., for three folders your target should contain values in [0, 1, 2].
These target values are used to index the output of the model during the loss calculation, so they are used as numerical values instead of the folder names.
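
To illustrate the indexing, here is a pure-Python sketch of a cross-entropy-style loss for a single sample, assuming the model outputs one raw score (logit) per class. The function name and the values are made up for this example; in PyTorch, torch.nn.CrossEntropyLoss does the equivalent computation on tensors and accepts class indices as targets.

```python
import math


def cross_entropy(logits, target):
    """Negative log-probability of the class at index `target`."""
    # Log-softmax, subtracting the max logit for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    log_probs = [x - log_sum_exp for x in logits]
    # The target is used directly as an index into the model output.
    return -log_probs[target]


# Hypothetical model output for one image: one score per class.
logits = [2.0, 0.5, -1.0]

loss_correct = cross_entropy(logits, 0)  # target matches the highest score
loss_wrong = cross_entropy(logits, 2)    # target points at the lowest score

print(loss_correct, loss_wrong)
```

A folder name such as "cat" could not be used as an index here, which is why the dataset returns integer positions instead.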

John_J_Watson · May 21, 2020, 10:30am

@ptrblck: thank you for this clarification. I think I got confused because my folder_names were also integers. It does make sense, since we compute the probability of classes at the end and thus indexing makes sense.

Thank you again. This was VERY helpful.