Custom label for torchvision ImageFolder class

ASHUTOSH_CHANDRA · August 1, 2019, 10:34am

I’m using the torchvision ImageFolder class to create my dataset, together with a custom loader function.

By default, ImageFolder creates labels from the directory names. I want to change this behaviour and assign my own label to each image. How can I do that?

ptrblck · August 2, 2019, 9:26pm

You could have a look at the implementation of ImageFolder and DatasetFolder and derive your own class using e.g. DatasetFolder as the parent class.

Would that work for you?

redtailedhawk · December 29, 2019, 2:07am

EDIT: creating as my own post.

I’m a little confused here. I created a custom Dataset, and in my __init__ I changed the classes to what I wanted them to be by calling a custom _find_classes method. (I wanted to use subfolders and concatenate their names with their parents’ names.) This took my class count from something like 30 up to 964.

class MyDataset(Dataset):
    def __init__(self, image_path, transform=None):
        super(MyDataset, self).__init__()
        self.data = datasets.ImageFolder(image_path, transform)
        self.data.classes, self.data.class_to_idx = self._find_classes(image_path)

    def _find_classes(self, dir):
        # Custom labels

When I call class_to_idx, I have 964 different classes as expected, but when I create a DataLoader, it still gives me the old label count. If I print the labels, nothing is higher than 29, i.e. they still follow the old mapping to the parent folders.

train_dataloader = DataLoader(data, batch_size=32, shuffle=True, num_workers=0)
data_iter = iter(train_dataloader)
images, labels = next(data_iter)

print(labels)

out:
tensor([ 0,  2, 19, 19, 29, 20,  3, 20, 22, 27,  1, 18,  4, 20, 17,  1,  3, 23,
        20, 18,  6, 29,  6, 18,  9, 24,  8, 29, 13, 19, 21, 14])

And if I print data.targets, it shows me the old mapping, before I tried to customize my labels.

It seems like I changed the classes on my data, but not my targets. What is a better way for me to do this more efficiently? It’s clear that my _find_classes has nothing to do with the _find_classes being called by ImageFolder, and I’m changing classes after the dataset has been created. What can I do?

ptrblck · December 29, 2019, 9:06am

The internal classes and class_to_idx attributes are used in DatasetFolder's __init__ to create the samples as seen in these lines of code.
After ImageFolder was initialized (DatasetFolder is the parent class of it), changing these values won’t have any effect on the samples anymore, so you might want to derive your custom class from DatasetFolder and change these attributes in the __init__ method.
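
To make the ordering issue concrete, here is a minimal sketch using a toy stand-in class. `SnapshotDataset` is hypothetical and is not torchvision code; it only mimics the order of operations described above, where the integer labels are computed once in `__init__` and stored in the samples.

```python
class SnapshotDataset:
    """Toy stand-in (not torchvision code) that builds samples in __init__."""

    def __init__(self, files_by_class):
        self.classes = sorted(files_by_class)
        self.class_to_idx = {name: i for i, name in enumerate(self.classes)}
        # Labels are frozen here: each sample stores a plain int.
        self.samples = [
            (path, self.class_to_idx[name])
            for name in self.classes
            for path in files_by_class[name]
        ]
        self.targets = [label for _, label in self.samples]


ds = SnapshotDataset({"cat": ["cat/1.png"], "dog": ["dog/1.png", "dog/2.png"]})
before = list(ds.targets)

# Editing the mapping afterwards does not touch the stored integers.
ds.class_to_idx = {"cat": 10, "dog": 20}
after = list(ds.targets)
print(before, after)
```

The mapping is only consulted while the samples are being built, so reassigning it later changes what `class_to_idx` reports but not what indexing the dataset returns.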

redtailedhawk · December 30, 2019, 11:29pm

Wow, OK. So it goes deep. Thanks. So the main point to know about a dataset is that it’s just an iterable list containing a (path, class_to_idx[target]) tuple? And that’s what all dataloaders are looking for? Do I understand that correctly?

Ultimately then, I’ll want to either fix what gets assigned to data.targets before it comes out of the DatasetFolder object, or overwrite the targets somehow afterwards (if possible). Pretty nifty stuff.

Thanks again
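
The "overwrite them afterwards" idea can be sketched in plain Python. The helper names `relabel` and `parent_child` are hypothetical, and the labelling rule (grandparent folder plus parent folder) is an assumption taken from the description earlier in the thread.

```python
from pathlib import PurePosixPath


def relabel(samples, label_fn):
    """Rebuild (path, index) pairs using class names from label_fn(path)."""
    names = [label_fn(path) for path, _ in samples]
    classes = sorted(set(names))
    class_to_idx = {name: i for i, name in enumerate(classes)}
    new_samples = [(path, class_to_idx[name])
                   for (path, _), name in zip(samples, names)]
    return classes, class_to_idx, new_samples


def parent_child(path):
    """Hypothetical labelling rule: '<grandparent>_<parent>' of the file."""
    parts = PurePosixPath(path).parts
    return f"{parts[-3]}_{parts[-2]}"


# Old samples as a folder-per-class dataset would store them.
old = [("root/a/x/1.png", 0), ("root/a/y/2.png", 0), ("root/b/x/3.png", 1)]
classes, class_to_idx, new = relabel(old, parent_child)
targets = [label for _, label in new]
```

With an ImageFolder instance, the rebuilt list would have to be written back to its `samples` attribute (and `targets`, `classes` and `class_to_idx` kept in sync), since `__getitem__` reads each `(path, target)` pair from `samples`.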

ptrblck · December 31, 2019, 3:22am

I would write a custom Dataset deriving from DatasetFolder as the parent class.
In the __init__ method, use your custom method to calculate the class_to_idx mapping, then apply other methods if desired, e.g. datasets.folder.make_dataset.
This would probably be the easiest and cleanest approach.

Let me know, if you get stuck somewhere.

redtailedhawk · December 31, 2019, 6:39am

@ptrblck It was rough at the end, but I got it to work. I didn’t subclass DatasetFolder. Here’s what I did. I wonder if you have any tips on how I could have done this more simply; it was a doozy. By the time I got to the loaders, I just straight up copy-pasted the 3 functions from the code you linked to earlier.

from pathlib2 import Path
from PIL import Image  # used by pil_loader below
from torch.utils.data import Dataset

class MyDataset(Dataset):
    def __init__(self, image_path, transform=None):
        """Imports dataset from folder structure.

        Args:
            image_path: (string) Folder where the image samples are kept.
            transform: (Object) Image processing transformations.

        Attributes: 
            classes: (list) List of the class names.
            class_to_idx: (dict) pairs of (class_name, class_index).
            samples: (list) List of (sample_path, class_index) tuples.
            targets: (list) class_index value for each image in dataset.
        """

        super(MyDataset, self).__init__()
        # ORIGINAL STRATEGY - Couldn't customize labels to include broader
        # folder structures.
        # self.data = datasets.ImageFolder(image_path, transform)

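        # Accept str or Path; the helpers below rely on Path methods (rglob, parts)
        image_path = Path(image_path).expanduser()
        self.transform = transform
        self.classes, self.class_to_idx = self._find_classes(image_path)
        self.samples = self.make_dataset(image_path, self.class_to_idx)
        self.targets = [s[1] for s in self.samples]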

    def _find_classes(self, dir):
        """Creates classes from the folder structure.

        Args:
            dir: (Path) Root directory path.

        Returns:
            tuple: (classes, class_to_idx) where classes are relative to (dir),
            and class_to_idx is a dictionary.
        """

        classes = []

        for d in dir.rglob("*"):
            if d.is_dir():
                # Split path into strings 
                parts = d.parts # type: tuple, strings
                item = f"{parts[-2]}_{parts[-1]}"
                classes.append(item)

        classes.sort()
        class_to_idx = {classes[i]: i for i in range(len(classes))}

        return classes, class_to_idx

    def _get_target(self, file_path):
        """Returns a target_class from the parent and grandparent folders.

        Args:
            file_path: (Path) path to the file.

        Returns:
            target_class: (string) target class for that file.
        """

        parts = file_path.parts
        target_class = f"{parts[-3]}_{parts[-2]}"
        return target_class

    def make_dataset(self, dir, class_to_idx):
        """Returns a list of image path, and target index

        Args:
            dir: (Path) Root directory that is searched for image samples
            class_to_idx: (dict: string, int) Sorted classes, mapped to int

        Returns:
            images: (list of tuples) Path and mapped class for each sample
        """

        images = []

        dir = Path(dir).expanduser()  # also works when dir is passed as a str

        for d in dir.rglob("*.png"):
            if not d.is_dir():
                target = self._get_target(d)
                item = (d, class_to_idx[target])
                images.append(item)

        return images

    def get_class_dict(self):
        """Returns a dictionary of classes mapped to indices."""
        return self.class_to_idx

    def __getitem__(self, index):
        """Returns tuple: (sample, target) where target is class_index of
        target_class.

        Args:
            index: (int) Index.
        """

        path, target = self.samples[index]
        sample = default_loader(path)
        # transform defaults to None, so only apply it when one was given
        if self.transform is not None:
            sample = self.transform(sample)

        return sample, target

    def __len__(self):
        return len(self.samples)


def pil_loader(path):
    # open path as file to avoid ResourceWarning
    # (https://github.com/python-pillow/Pillow/issues/835)
    with open(path, 'rb') as f:
        img = Image.open(f)
        return img.convert('RGB')


def accimage_loader(path):
    import accimage
    try:
        return accimage.Image(path)
    except IOError:
        # Potentially a decoding problem, fall back to PIL.Image
        return pil_loader(path)


def default_loader(path):
    from torchvision import get_image_backend
    if get_image_backend() == 'accimage':
        return accimage_loader(path)
    else:
        return pil_loader(path)
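
One detail worth checking in the `_find_classes` above: as written, it appears to register every directory under the root as a class, including the first-level parent folders, which would inflate the class count. A minimal sketch of an alternative that derives the classes from the image files themselves follows. The function name `find_classes_from_files` and the `root/<parent>/<child>/<file>` layout are assumptions for illustration, and the files created here are empty placeholders.

```python
import tempfile
from pathlib import Path


def find_classes_from_files(root, pattern="*.png"):
    """Return (classes, class_to_idx, samples) for root/<parent>/<child>/<file>."""
    root = Path(root).expanduser()
    labelled = []
    for path in sorted(root.rglob(pattern)):
        if path.is_file():
            # Class name is built from the two folders directly above the file.
            name = f"{path.parent.parent.name}_{path.parent.name}"
            labelled.append((path, name))
    classes = sorted({name for _, name in labelled})
    class_to_idx = {name: i for i, name in enumerate(classes)}
    samples = [(path, class_to_idx[name]) for path, name in labelled]
    return classes, class_to_idx, samples


with tempfile.TemporaryDirectory() as tmp:
    for rel in ("a/x/1.png", "a/y/2.png", "b/x/3.png"):
        f = Path(tmp, rel)
        f.parent.mkdir(parents=True, exist_ok=True)
        f.touch()  # empty placeholder, never opened as an image
    classes, class_to_idx, samples = find_classes_from_files(tmp)
    targets = [label for _, label in samples]
```

Only folders that actually contain images end up as classes, so the number of classes matches the number of distinct labels in `targets`.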

redtailedhawk · December 31, 2019, 8:20pm

I also have a question about types. My classes, samples, targets etc from MyDataset object are lists and dicts and my model is training fine on my CPU. Is it good practice to explicitly convert them to tensors?

ptrblck · January 3, 2020, 6:47am

The mentioned container classes should work fine, since usually you would unpack the tensors inside your training loop and push them to the device.
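
As a small illustration of that point, the sketch below keeps the labels in a plain Python list and lets the DataLoader's default collation batch them into a tensor. `ListBackedDataset` is a hypothetical class, and the zero tensor is a placeholder for a transformed image.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ListBackedDataset(Dataset):
    """Hypothetical dataset keeping labels in a plain Python list."""

    def __init__(self, targets):
        self.targets = targets

    def __len__(self):
        return len(self.targets)

    def __getitem__(self, index):
        image = torch.zeros(3, 8, 8)  # placeholder for a transformed image
        return image, self.targets[index]  # label is a plain int


loader = DataLoader(ListBackedDataset([0, 2, 1, 2]), batch_size=4)
images, labels = next(iter(loader))  # default collation stacks ints into a tensor

# Move the batch to the device inside the loop, not inside the dataset.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
images, labels = images.to(device), labels.to(device)
```

The lists and dicts on the dataset object are bookkeeping only; the batches coming out of the loader are already tensors.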

o_v_shake · April 17, 2021, 7:00pm

I was able to do the changes I needed on the labels by exploiting the target_transform parameter.

class ImageNet1k:
    def __init__(self, path='/mnt/data/rajivratn/maiti/imagenet/imagenet-val'): 
        self.valdir = path
        self.data = None
    
    def make_dataset(self):
        normalize = transforms.Normalize(mean=[0.5, 0.5, 0.5], 
                                        std=[0.5, 0.5, 0.5])
        self.data = datasets.ImageFolder(self.valdir, transforms.Compose([
                transforms.Resize(256),
                transforms.CenterCrop(224),
                transforms.ToTensor(),
                normalize,
            ]))
        self.class_to_idx = self.data.class_to_idx 
        self.idx_to_class = {v: int(k) for k,v in self.class_to_idx.items()}
        return self.data

    def get_val_dataloader(self, batch_size, num_workers=0):
        if self.data is None:
             self.make_dataset() 
        val_loader = torch.utils.data.DataLoader(
            self.data,
            batch_size=batch_size, shuffle=False,
            num_workers=num_workers, pin_memory=False)
        
        return val_loader


if __name__ == '__main__':
    args = parseargs() 
    if args.gpu == -1:
        device = 'cpu'
    else:
        device = f'cuda:{args.gpu}' if torch.cuda.is_available() else 'cpu'
    model = vit_base_patch16_224(pretrained=True) 
    model = model.to(device)
    in1k = ImageNet1k() 
    val_dataloader = in1k.get_val_dataloader(batch_size=args.batch_size) 
    def target_transform(label):
        return in1k.idx_to_class[label] 
    
    in1k.data.target_transform = target_transform
    
    model.eval() 
    print(f'==> Training state of model: {model.training}')
    total = 0
    correct = 0
    for batch_idx, (images, labels) in tqdm(enumerate(val_dataloader)):
        images, labels = images.to(device), labels.to(device)
        y_preds = model(images)
        y_preds = torch.argmax(y_preds, dim=1)
        # labels = torch.tensor([in1k.idx_to_class[label.item()] for label in labels]).to(device)
        correct += (y_preds == labels).sum()
        total += y_preds.shape[0]
    print(f'Current Accuracy: {correct / total:.2f}')
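
The remapping itself can be looked at in isolation. The sketch below assumes, as the `int(k)` conversion in the post suggests, that the folder names are numeric strings; the four folder names are made up for illustration. ImageFolder sorts class folders as strings, so the index it assigns need not equal the number in the folder name.

```python
# Folder names as ImageFolder would sort them: lexicographically, as strings.
folders = sorted(["0", "1", "2", "10"])
class_to_idx = {name: i for i, name in enumerate(folders)}

# Invert the mapping so each assigned index leads back to the numeric name.
idx_to_class = {idx: int(name) for name, idx in class_to_idx.items()}


def target_transform(label):
    """Map the index ImageFolder assigned back to the folder's own number."""
    return idx_to_class[label]


remapped = [target_transform(i) for i in range(len(folders))]
```

Assigning the callable to the dataset's `target_transform` attribute, as in the post, should take effect as long as it happens before iteration starts. With `num_workers > 0` the worker processes receive their copy of the dataset when the iterator is created, so it is safer to set it before the first loop.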