Metadata in dataloader (such as file names)

Is there a way to save the file name of each file in the test and train data sets into the data structure that the DataLoader creates?

For example, if I retrieve a particular piece of data from the DataLoader, can I get the filename that piece of data was created from?

I am doing image analysis and would like to be able to go back to the original image file to (1) check any manipulation done on the image at load time, such as normalization, and (2) compare predictions with the metadata available for the original image.

from torch.utils.data import DataLoader
from torchvision import datasets

train_data = datasets.ImageFolder(DEST_PATH + 'train/', transform=transform)
train_data_loader = DataLoader(train_data, batch_size=batch_size, shuffle=True, num_workers=2)
for i, (images, lbls) in enumerate(train_data_loader, 0):

At this point, is there a way to get the filename that created “i, (images,lbls)”?


The DataLoader doesn’t have any knowledge about the dataset besides what is returned by Dataset.__getitem__. You could thus return the image names from the __getitem__ method in addition to the data and target, and then use them in the DataLoader loop.

Thanks for the response!
So, I would make a local/custom version of DataLoader with the changes in the __getitem__ method?

You would create a custom Dataset (not DataLoader) which would then return the image names in its __getitem__.
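
A minimal sketch of that idea, written as a thin wrapper rather than a full reimplementation. The class name WithPaths and the helper demo are hypothetical. The sketch assumes the wrapped dataset exposes a samples list of (path, class_index) tuples in index order, as torchvision's ImageFolder does, and that the transform turns each image into a tensor so the default batching can stack them.

```python
class WithPaths:
    """Wrap a map-style dataset so each item also carries its file path.

    Assumption: the wrapped dataset has a `samples` attribute, a list of
    (path, class_index) tuples in index order (ImageFolder provides this).
    """

    def __init__(self, dataset):
        self.dataset = dataset

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, index):
        # The wrapped dataset loads the image and applies its transform.
        image, label = self.dataset[index]
        # Look up the file the image at this index was loaded from.
        path, _ = self.dataset.samples[index]
        return image, label, path


def demo(root, transform, batch_size=4):
    """Hypothetical usage; requires torch and torchvision to be installed.

    `transform` is assumed to return tensors (e.g. it ends with ToTensor).
    """
    from torch.utils.data import DataLoader
    from torchvision import datasets

    train_data = WithPaths(datasets.ImageFolder(root, transform=transform))
    loader = DataLoader(train_data, batch_size=batch_size, shuffle=True)
    for i, (images, lbls, paths) in enumerate(loader, 0):
        # paths[k] is the file behind images[k], even with shuffle=True,
        # because the path travels with the sample through __getitem__.
        print(i, paths)
        break
```

Because the path is returned from __getitem__ together with the image and label, it stays aligned with them after shuffling and batching; the strings arrive in the loop as a sequence with one entry per image in the batch.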