DataLoader Filenames in each Batch

Hi, everyone,

I’m pretty new to PyTorch and am using DataLoader to wrap my own image dataset. Suppose I’ve trained a binary image classifier and now want to use it to pick out the images it misclassified. How can I do this?

I’ve been looking at the DataLoader class, but all I get from it are batches of transformed tensors from the sampled images. I know which tensor in each batch is misclassified, but I can’t easily get the corresponding filenames. Has anyone run into something similar and found a workaround? Any suggestions are very welcome.

I really appreciate your help. Thanks.


You can write a custom Dataset that returns not only the images but also their ids or paths.
For example, the return statement in `__getitem__` can be

return image, target, path

or

return image, target, index
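
The suggestion above can be sketched roughly as follows. This is only a minimal illustration, not a definitive implementation: the class name `ImageDatasetWithPaths` and the helper `find_misclassified` are hypothetical, and the image is a placeholder tensor where a real dataset would load and transform the file at `path`. It relies on the default collate function of DataLoader passing strings through unchanged, so each batch carries the paths of its samples.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ImageDatasetWithPaths(Dataset):
    """Hypothetical dataset whose samples are (image, target, path)."""

    def __init__(self, paths, targets):
        self.paths = list(paths)
        self.targets = list(targets)

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, index):
        path = self.paths[index]
        # Assumption: placeholder tensor. A real dataset would open the
        # file at `path` here and apply its transforms.
        image = torch.full((3, 8, 8), float(index))
        target = self.targets[index]
        return image, target, path


def find_misclassified(model, loader):
    """Collect the paths of all samples whose prediction differs from the target."""
    misclassified = []
    model.eval()
    with torch.no_grad():
        for images, targets, paths in loader:
            # Assumes the model outputs one logit per class.
            preds = model(images).argmax(dim=1)
            wrong = (preds != targets).tolist()
            misclassified.extend(p for p, w in zip(paths, wrong) if w)
    return misclassified
```

With a trained model, something like `find_misclassified(model, DataLoader(dataset, batch_size=32))` would then return the list of misclassified filenames. Returning `index` instead of `path` works the same way; the index can be mapped back to a filename afterwards through the dataset's own list of paths.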

Great. Thanks a lot.

For anyone who stumbles upon this later, I made a convenient little gist for this:


It works, thank you! :slight_smile: