How to efficiently apply transforms?

I want to create two copies of each image, each with a different strong augmentation applied.
I gave up on configuring two datasets, one per augmentation, because the dataset is too large and the CPU couldn't handle the load.

So I tried composing only one dataset and creating the two augmented versions inside each mini-batch. The code below works, but I think it's very inefficient.
(I failed to convert the 4D batch (BxCxHxW) to PIL Images directly, so I use a for loop.)
Please tell me a more efficient way to implement this.

>> In Dataset:
transforms.Compose([
                    transforms.Resize([256, 256]),
                    transforms.ToTensor(),
                ]),

...
import torch
import torchvision.transforms as transforms

# built once, outside the training loop
strong_trans = transforms.Compose([
    transforms.ToPILImage(),
    transforms.RandomCrop(size=224,
                          padding=int(224 * 0.125),
                          padding_mode='reflect'),
    transforms.RandAugment(),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])

for step, data in enumerate(data_loader):
        # per-sample round-trip through PIL
        img_strong_trans = []
        for x in data:
            img_strong_trans.append(strong_trans(x))

        img_strong_trans = torch.stack(img_strong_trans)

You could apply both transformations inside `Dataset.__getitem__` and return both transformed samples, which avoids the additional per-sample transformation loop inside the `DataLoader` loop (and lets the `DataLoader` workers do the augmentation in parallel).
