How to load Images without using 'ImageFolder'

alx · November 4, 2019, 11:27pm

Hey! In all my previous work I’ve always used:

torchvision.datasets.ImageFolder(root='', transform=trsfm)

I am now in a situation where my data is not split into subfolders like this:
root/dog/xxx.png
root/cat/asd932_.png

but rather:
root/asd932_.png
root/asd933_.png

What I’m trying to do is predict new images (with no labels) from a trained model.

How do people load their images when working on an unsupervised model?

JuanFMontesinos · November 5, 2019, 12:31am

You can write your own custom dataset (https://pytorch.org/docs/stable/data.html#torch.utils.data.Dataset)
to load any kind of data format.

zimmer550 · November 5, 2019, 2:18am

Write your own dataset class:

import os

import natsort
from PIL import Image
from torch.utils.data import Dataset


class CustomDataSet(Dataset):
    def __init__(self, main_dir, transform):
        self.main_dir = main_dir
        self.transform = transform
        # List every entry in the folder and sort naturally (img2 before img10)
        all_imgs = os.listdir(main_dir)
        self.total_imgs = natsort.natsorted(all_imgs)

    def __len__(self):
        return len(self.total_imgs)

    def __getitem__(self, idx):
        # Load one image, force 3 channels, then apply the transform
        img_loc = os.path.join(self.main_dir, self.total_imgs[idx])
        image = Image.open(img_loc).convert("RGB")
        tensor_image = self.transform(image)
        return tensor_image

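Then instantiate the Dataset and the DataLoader:

from torch.utils import data

my_dataset = CustomDataSet(img_folder_path, transform=trsfm)
# drop_last=True skips the final incomplete batch; use False to see every image
train_loader = data.DataLoader(my_dataset, batch_size=batch_size, shuffle=False,
                               num_workers=4, drop_last=True)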

Then iterate:

for idx, img in enumerate(train_loader):
    # do your training now
    pass
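
One caveat with the class above: os.listdir returns every entry in the folder, so a stray text file or subfolder would make Image.open fail. The following is a minimal sketch of a stricter listing step, assuming that images can be recognized by extension. The names list_image_files, natural_key and IMAGE_EXTENSIONS are hypothetical helpers, not part of PyTorch or natsort, and the natural sort here is a simple stand-in for what natsort does more thoroughly.

```python
import os
import re

# Assumed set of extensions; adjust to whatever your folder contains.
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".bmp")


def natural_key(name):
    """Sort key that compares digit runs as numbers, so img2 < img10."""
    parts = re.split(r"(\d+)", name)
    # re.split with a capturing group alternates text, digits, text, ...
    # so odd positions always hold the digit runs.
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


def list_image_files(main_dir):
    """Return the image file names in main_dir in natural order."""
    names = [
        n for n in os.listdir(main_dir)
        if n.lower().endswith(IMAGE_EXTENSIONS)
        and os.path.isfile(os.path.join(main_dir, n))
    ]
    return sorted(names, key=natural_key)
```

In the class above, self.total_imgs = list_image_files(main_dir) could then replace the two lines that call os.listdir and natsort.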

alx · November 9, 2019, 4:28am

NameError: name 'Dataset' is not defined?

sperry · November 9, 2019, 5:24am

from torch.utils.data import Dataset

alx · November 9, 2019, 5:25am

Thanks! I think there’s another mistake, at least when natsorted is imported directly (from natsort import natsorted):
self.total_imgs = natsort.natsorted(all_imgs)
to
self.total_imgs = natsorted(all_imgs)

Do you have a custom loader as well?

sperry · November 9, 2019, 5:54am

Yes, it’s pretty common to write your own.

You have a lot of freedom to implement the __len__ and __getitem__ methods to accommodate your use case, folder structure, etc.

__len__ needs to return the size of the dataset.

__getitem__ needs to return your image tensor for the image with index ‘idx’. You can also return labels, bounding boxes, etc., as required for training.

Transforms can be leveraged, but aren’t required.

The sample code above should work… Just pass in your image folder path when you instantiate the DataSet.

Example: my_dataset = CustomDataSet("path/to/root/", transform=your_transforms)

If you aren’t using transforms, remove the 3 references to transform in your CustomDataSet code.

dcprime · July 15, 2020, 6:28pm

I know this is a bit of an old discussion at this point, but I’m looking for some similar help.

I need to load data into a 3D CNN. My data are sets of JPG images (“video frames”) that represent gestures being performed.

How can I create a custom Dataset class to accommodate this situation? And how do my classification labels get added to my data for training purposes?

Thanks for your help!

ptrblck · July 16, 2020, 7:06am

It depends a bit on the current structure of your data.
Generally, if you are implementing a custom Dataset, you would need to implement:

the __getitem__(self, index) method, which uses the passed index to load a single “sample” of the dataset
the __len__(self) method, which returns the length of the dataset and thus defines the indices to be sampled from the range [0, self.__len__() - 1]
as well as the __init__ method, if you want to pass e.g. image paths, transformations, etc.

If your images are all in a single folder and can be loaded sequentially, you could define a window_length in the __init__ and load multiple frames in __getitem__. I assume the target would be a single class index for the complete sequence.

It’s a bit hard to give a good recommendation without knowing your use case.
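
The window_length idea can be sketched as follows. This is only the index arithmetic, under the assumption that all frames sit in one ordered list. FrameWindowDataset, stride and load_frame are hypothetical names. In real use the class would subclass torch.utils.data.Dataset, and load_frame would open the image and return a tensor.

```python
class FrameWindowDataset:
    """Map-style dataset over sliding windows of consecutive frames."""

    def __init__(self, frame_paths, window_length, stride=1,
                 load_frame=lambda path: path):
        if window_length < 1 or stride < 1:
            raise ValueError("window_length and stride must be >= 1")
        self.frame_paths = list(frame_paths)
        self.window_length = window_length
        self.stride = stride
        # Placeholder loader: returns the path itself instead of pixel data.
        self.load_frame = load_frame

    def __len__(self):
        # Number of full windows that fit into the frame list.
        spare = len(self.frame_paths) - self.window_length
        return 0 if spare < 0 else spare // self.stride + 1

    def __getitem__(self, idx):
        if not 0 <= idx < len(self):
            raise IndexError(idx)
        start = idx * self.stride
        window = self.frame_paths[start:start + self.window_length]
        return [self.load_frame(p) for p in window]


paths = [f"frame_{i:03d}.jpg" for i in range(10)]
dataset = FrameWindowDataset(paths, window_length=4, stride=2)
print(len(dataset))  # 4 windows, starting at frames 0, 2, 4 and 6
print(dataset[3])    # the last window: frame_006.jpg to frame_009.jpg
```

With stride equal to window_length the windows do not overlap. A target for the whole sequence could be returned alongside the window once the labels are known.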

dcprime · July 16, 2020, 12:13pm

To describe my use case in a bit more detail:

We are attempting to classify gestures using a 3D CNN. Each gesture is captured as a series of JPG frames and indexed as a start_frame number and an end_frame number. There are 8 gestures to identify altogether.

Currently the data is saved in a separate directory for each Subject who performed the 8 gestures, but it’s our own dataset and we can alter the directory structure in any way we choose.

One aspect I don’t understand is how the network is made aware of the class labels. Are the classes (output targets) derived from the directory structure by DataLoader() somehow? And how can I separate my training and validation data appropriately?
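
On the training/validation question, one option (an assumption here, not something settled in this thread) is to split by Subject, so that no person appears in both sets. A minimal sketch with hypothetical folder names and a hypothetical helper split_by_subject:

```python
import random


def split_by_subject(subjects, val_fraction=0.2, seed=0):
    """Return (train_subjects, val_subjects) with no subject in both."""
    shuffled = sorted(subjects)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split stable
    n_val = max(1, round(len(shuffled) * val_fraction))
    return sorted(shuffled[n_val:]), sorted(shuffled[:n_val])


# Hypothetical subject directory names
subjects = [f"subject_{i:02d}" for i in range(1, 11)]
train_subjects, val_subjects = split_by_subject(subjects)
print(len(train_subjects), len(val_subjects))  # 8 2
```

Each list can then be handed to its own Dataset instance, one for training and one for validation.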

ptrblck · July 17, 2020, 12:02pm

You can define the targets as you wish.

The DataLoader is not responsible for the data and target creation, but allows you to automatically create batches, use multiprocessing to load the data in the background, use custom samplers, shuffle the dataset etc.

The Dataset defines how the data and target samples are created.
torchvision.datasets provides some “standard approaches”, such as ImageFolder, which creates the targets based on the subfolders of the directory passed as the root argument (the subfolders are sorted and each folder gets its own class label).
However, this approach might be too limited for your use case, so I would recommend implementing a custom Dataset.
This tutorial gives you a good overview of creating a Dataset and using a DataLoader.
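
To make the ImageFolder labeling rule concrete, here is a small sketch that mimics it using only the standard library: subfolder names are sorted and numbered from 0. The function folder_labels is a hypothetical stand-in, not torchvision code. On a real ImageFolder instance the same mapping is available as the class_to_idx attribute.

```python
import os
import tempfile


def folder_labels(root):
    """Map each subfolder name of root to a class index, in sorted order."""
    classes = sorted(e.name for e in os.scandir(root) if e.is_dir())
    return {name: idx for idx, name in enumerate(classes)}


# Build a throwaway root/dog, root/cat layout to show the result.
with tempfile.TemporaryDirectory() as root:
    for name in ("dog", "cat"):
        os.mkdir(os.path.join(root, name))
    class_to_idx = folder_labels(root)

print(class_to_idx)  # {'cat': 0, 'dog': 1}
```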

As described, the __getitem__ method would be responsible for loading the data and target (or creating the target based on some properties).
Before changing the data structure, let’s think about how the batches should be created.
I.e. would it be possible to create a batch given a single index using the current data structure?
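
A minimal sketch of that division of labor, assuming each gesture clip can be looked up by a single index. GestureClips and samples are hypothetical names, and random tensors stand in for frames loaded from disk. The Dataset returns one (clip, label) pair, and the DataLoader only stacks those pairs into batches.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class GestureClips(Dataset):
    """Each sample is one gesture clip together with its class index."""

    def __init__(self, samples):
        # samples: list of (clip_tensor, class_index) pairs
        self.samples = samples

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        clip, label = self.samples[index]
        return clip, label


# 10 fake clips shaped (channels, frames, height, width), 8 gesture classes
samples = [(torch.randn(3, 16, 32, 32), i % 8) for i in range(10)]
dataset = GestureClips(samples)
loader = DataLoader(dataset, batch_size=4, shuffle=True)

for clips, labels in loader:
    # clips: (batch, 3, 16, 32, 32); labels: int64 tensor of shape (batch,)
    print(clips.shape, labels.shape)
```

The labels never touch the directory structure here: whatever __getitem__ returns as the second element becomes the target.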

dcprime · July 22, 2020, 11:25am

Thanks for your help! I was able to write a custom Dataset object that worked for me.

I just needed to understand that Dataset’s job is to return a single training example, and how the target for that example is returned by Dataset.

akib62 · December 8, 2020, 5:31pm

Hello @sperry @zimmer550 ,

I am using the same code you gave here for creating the custom dataset. Then, as per your guidance, I am giving the path of my dataset like below:

train_data_dir = '/home/Houses-dataset-master'

# Transformation
transform = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])

# Pass the path to the dataset
train_data_tensor = CustomDataSet(train_data_dir, transform=transform)

# Try to print the length of train_data_tensor
print(len(train_data_tensor))

At first I was getting an error for natsort:

NameError: name 'natsort' is not defined

Then I removed natsort and replaced every total_imgs with all_imgs.

But now I am getting another error:

AttributeError: 'CustomDataSet' object has no attribute 'all_imgs'

Could you tell me how I can solve this error?

Updated: Full error problem is given in here

kkris88 · August 17, 2021, 5:43am

You can pip install natsort, and then add import natsort at the top of your script.
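
As for the AttributeError above, a likely cause (an assumption, since the edited code is not shown) is that all_imgs is only a local variable inside __init__, so self.all_imgs never exists. A minimal sketch of a natsort-free version that keeps the list on self, using plain sorted(), which gives lexicographic rather than natural order:

```python
import os


class CustomDataSet:  # in real code: class CustomDataSet(Dataset)
    def __init__(self, main_dir, transform=None):
        self.main_dir = main_dir
        self.transform = transform
        # Local variable: it disappears once __init__ returns.
        all_imgs = os.listdir(main_dir)
        # Stored on self, so __len__ and __getitem__ can use it.
        self.total_imgs = sorted(all_imgs)

    def __len__(self):
        return len(self.total_imgs)
```

The other methods should keep referring to self.total_imgs, exactly as in the original class.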