Manually select batches from DataLoader

Hi,
I have been following some PyTorch tutorials that use datasets.ImageFolder and then utils.data.DataLoader to iterate over the dataset. This, however, is a big problem for me, because the task I am trying to solve strictly requires me to know exactly which image I am passing through my neural network.

In my task I am comparing images, and the images are numbered, so when comparing some images with my net I need to be able to specify that I wish to run image 'i' and then image 'j' through the neural network.

This is how I am imagining it:

image_datasets = datasets.ImageFolder(os.path.join(data_dir, 'food'), transform= data_transforms)

dataloaders = torch.utils.data.DataLoader(image_datasets, batch_size=1, shuffle=False, num_workers=0)

output1 = model(dataloaders[i])
output2 = model(dataloaders[j])

Nothing like this seems to exist. How can I access data of my choice?

If you don't need the batching, shuffling, or multiple workers that the DataLoader provides, you could index image_datasets directly, e.g. image_datasets[i], which returns an (image, target) tuple.
Note that you would need to add a batch dimension to your data and target, since the Dataset will not create this dimension for you.
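
Here is a minimal sketch of the direct-indexing approach. ToyImageDataset and the small linear model are hypothetical stand-ins I made up so the snippet runs without any image files; in your case image_datasets would be your datasets.ImageFolder instance and model your own network. The part that matters is the indexing and the unsqueeze(0) call.

```python
import torch
from torch import nn
from torch.utils.data import Dataset


class ToyImageDataset(Dataset):
    """Hypothetical stand-in for datasets.ImageFolder: returns (image, label)."""

    def __init__(self, num_images=5):
        # Image k is filled with the value k, so it is easy to recognise.
        self.images = [torch.full((3, 8, 8), float(k)) for k in range(num_images)]
        self.labels = [k % 2 for k in range(num_images)]

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        return self.images[idx], self.labels[idx]


image_datasets = ToyImageDataset()

# Placeholder model (an assumption); any module expecting [N, C, H, W] works.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 4))
model.eval()

i, j = 1, 3

# Index the dataset directly; each item is an (image, label) tuple.
img_i, target_i = image_datasets[i]
img_j, target_j = image_datasets[j]

# The dataset returns [C, H, W]; add the batch dimension to get [1, C, H, W].
batch_i = img_i.unsqueeze(0)
batch_j = img_j.unsqueeze(0)

# The label is a plain int here; wrap it to get a tensor of shape [1].
target_i = torch.tensor([target_i])

with torch.no_grad():
    output1 = model(batch_i)
    output2 = model(batch_j)
```

With an ImageFolder, the transform you passed in is applied inside __getitem__, so image_datasets[i] already gives you the transformed tensor.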

Alternatively, if you want to keep using the DataLoader, I would recommend implementing a custom sampler, which lets you specify which indices are returned at each step.
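
A rough sketch of the sampler approach could look like the following. FixedOrderSampler is a name I chose for illustration, not something that ships with PyTorch, and the TensorDataset is a placeholder for your ImageFolder. The sampler simply yields the indices you give it, in that order, and the DataLoader adds the batch dimension for you.

```python
import torch
from torch.utils.data import DataLoader, Sampler, TensorDataset


class FixedOrderSampler(Sampler):
    """Hypothetical sampler that yields exactly the given indices, in order."""

    def __init__(self, indices):
        self.indices = list(indices)

    def __iter__(self):
        return iter(self.indices)

    def __len__(self):
        return len(self.indices)


# Placeholder dataset (an assumption): image k is filled with the value k
# and its label is k, so we can check which sample was loaded.
images = torch.stack([torch.full((3, 8, 8), float(k)) for k in range(5)])
labels = torch.arange(5)
dataset = TensorDataset(images, labels)

i, j = 3, 1

# Leave shuffle at its default (False) when passing a sampler.
loader = DataLoader(
    dataset,
    batch_size=1,
    sampler=FixedOrderSampler([i, j]),
    num_workers=0,
)

seen = []
for img, label in loader:
    # img already has the batch dimension: [1, 3, 8, 8].
    seen.append(int(label))
```

After the loop, seen holds the labels in the order the samples were loaded, which should match the order of indices passed to the sampler.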

The first one worked, thanks for the help!