How to make an ImageFolder using absolute image paths?

Hello, I am a bit new to PyTorch, so I have a problem with loading images.

I want to do over-sampling with my data. So I decided to make a dictionary where the key is the class label and the values are the absolute paths to the images of that class. Then I count the number of images for every class, and if this amount is < 100 I increase the number of paths by duplicating some of them. Because of the random transforms this trick should work, I think. So now I need to make an ImageFolder with the images from these paths, but I actually don't know how, because ImageFolder requires subfolders.
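For reference, here is a minimal sketch of what I mean (the folder layout, the *.jpg pattern and the names data_root, class_to_paths, min_count are just placeholders):

import glob
import os

data_root = "data/train"   # hypothetical layout: data_root/<class_name>/<images>
min_count = 100            # over-sample classes with fewer images than this

class_to_paths = {}
for class_name in os.listdir(data_root):
    paths = glob.glob(os.path.join(data_root, class_name, "*.jpg"))
    # duplicate paths until the class reaches min_count entries
    while 0 < len(paths) < min_count:
        paths = paths + paths[: min_count - len(paths)]
    class_to_paths[class_name] = paths

The duplicated paths point to the same files, but random augmentations applied at load time should produce different tensors for each duplicate.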

You could write a custom Dataset and lazily load each sample from your dict in the __getitem__ method.
This tutorial gives you an example of how to write a Dataset.
Let me know if that works for you.
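Something along these lines might work (just a sketch; PathDictDataset and its argument names are made up here, and the dict is assumed to map class labels to lists of absolute paths):

from PIL import Image
from torch.utils.data import Dataset

class PathDictDataset(Dataset):
    """Flattens a {class_label: [image paths]} dict into (path, label) pairs."""

    def __init__(self, class_to_paths, transform=None):
        self.classes = sorted(class_to_paths)
        self.samples = [
            (path, idx)
            for idx, cls in enumerate(self.classes)
            for path in class_to_paths[cls]
        ]
        self.transform = transform

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        # lazy loading: the image is only opened when this sample is requested
        path, label = self.samples[index]
        image = Image.open(path).convert("RGB")
        if self.transform is not None:
            image = self.transform(image)
        return image, label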

Am I right that when I put the CustomDataset into a DataLoader, the DataLoader will call __getitem__ for every item in the CustomDataset?

Yes, that is correct, if you are using the default sampling and collate mechanism.
There are ways to provide a batch of indices, but this would be a special case.
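For example (a sketch, assuming dataset is your custom Dataset instance):

from torch.utils.data import BatchSampler, DataLoader, RandomSampler

# default behaviour: the sampler yields one index per __getitem__ call and the
# loaded samples are collated into a batch
loader = DataLoader(dataset, batch_size=32, shuffle=True)

# special case: disable automatic batching and let the sampler yield lists of
# indices, so __getitem__ receives a whole list of indices per call
batch_loader = DataLoader(
    dataset,
    sampler=BatchSampler(RandomSampler(dataset), batch_size=32, drop_last=False),
    batch_size=None,
)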

So, I found a custom Dataset that could work for me.
But for my task, and because I'm using Colab, it's easier for me to increase the number of images directly in the class folders.
I just use ImageFolder to create the validation ImageFolder, then I call a function that copies the missing pictures (because it's veeeeeery custom I wouldn't post it without any reason :slight_smile: ) and then I create the train ImageFolder.
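The idea of that function is roughly this (a generic sketch only, not my actual code; oversample_folder and min_count are placeholder names):

import os
import shutil

def oversample_folder(class_dir, min_count=100):
    # copy existing images in class_dir until it holds at least min_count files
    originals = sorted(os.listdir(class_dir))
    n = len(originals)
    if n == 0:
        return
    i = 0
    while n + i < min_count:
        src = os.path.join(class_dir, originals[i % n])
        dst = os.path.join(class_dir, "copy_%d_%s" % (i, originals[i % n]))
        shutil.copy(src, dst)
        i += 1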

But here are some parts of that Dataset implementation, maybe it'll be useful for someone:

At the beginning we store the list of paths in __init__:


def __init__(self, files, mode):
    super().__init__()

    # list of files (absolute image paths) to load
    self.files = sorted(files)

Then we define the load function:

def load_sample(self, file):
    image = Image.open(file)
    image.load()
    return image

Note that you need from PIL import Image for this.

And at the end we define __getitem__:

def __getitem__(self, index):
    x = self.load_sample(self.files[index])
    return x
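Put together, the class looks roughly like this (a simplified sketch: the transform, the mode handling and the DataLoader line are placeholders, not exactly my code):

from PIL import Image
from torch.utils.data import DataLoader, Dataset
from torchvision import transforms

class FileListDataset(Dataset):
    def __init__(self, files, mode):
        super().__init__()
        # list of files (absolute image paths) to load
        self.files = sorted(files)
        self.mode = mode  # e.g. 'train' or 'val'
        self.transform = transforms.Compose([
            transforms.Resize((224, 224)),
            transforms.ToTensor(),
        ])

    def __len__(self):
        return len(self.files)

    def load_sample(self, file):
        image = Image.open(file)
        image.load()
        return image

    def __getitem__(self, index):
        x = self.load_sample(self.files[index])
        return self.transform(x)

# loader = DataLoader(FileListDataset(paths, mode='train'), batch_size=32, shuffle=True)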