DataLoader for a VDSR network

Fabrice_Auzanneau · November 24, 2021, 8:42am

Hello all
I’m trying to run a VDSR network in PyTorch and I wonder how I should use the DataLoader.

I have made 4 directories:

HR: the ground truth images
LR_div2: each image from the HR folder is downsampled by a factor of 2, then upsampled back using bilinear interpolation. Each file keeps the name of its original image
LR_div3: the same with a factor of 3
LR_div4: the same with a factor of 4

Each folder has subfolders ‘train’ and ‘test’. Now I’m wondering how I can write the DataLoader.

I have found this piece of code:

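from os import listdir
from os.path import join

from PIL import Image
from torch.utils.data import Dataset, DataLoader

class AutoEncoderDataSet(Dataset):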
    def __init__(self, dir_lr, dir_gt, traintest, transform=None):
        self.dir_lr = self.load_dir_single(join(dir_lr, traintest))
        self.dir_gt = self.load_dir_single(join(dir_gt, traintest))
        self.transform = transform

    def is_image_file(self, filename):
        return any(filename.endswith(extension) for extension in [".png", ".PNG", ".jpg", ".JPG", ".jpeg", ".JPEG"])

    def load_img(self, filename):
        img = Image.open(filename)
        return img

    def load_dir_single(self, directory):
        # listdir returns names in arbitrary order; sort so LR and HR lists line up by file name
        return [join(directory, x) for x in sorted(listdir(directory)) if self.is_image_file(x)]

    def __len__(self):
        # the attribute set in __init__ is dir_lr (dir_in does not exist)
        return len(self.dir_lr)

    def __getitem__(self, index):
        img_lr = self.load_img(self.dir_lr[index])
        img_gt = self.load_img(self.dir_gt[index])
        sample = {'img_in': img_lr, 'img_gt': img_gt}

        if self.transform:
            sample = self.transform(sample)

        return sample

together with

ps = {
    'DIR_LR2': PATH + 'LR_div2',
    'DIR_LR3': PATH + 'LR_div3',
    'DIR_LR4': PATH + 'LR_div4',
    'DIR_HR':  PATH + 'HR'
}
train_set = AutoEncoderDataSet(ps['DIR_LR2'], ps['DIR_HR'], 'train', composed)
train_loader = DataLoader(train_set, batch_size=BATCH_SIZE_TRAIN, shuffle=True, num_workers=4)

But how can I take into account all 3 LR folders?
Thanks for your help.

ptrblck · November 25, 2021, 9:56am

If you want to concatenate the samples from all 3 directories, you could add another method to the class that accepts multiple folders, calls load_dir_single on each of them, and concatenates the samples at the end.
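
A minimal sketch of that idea, assuming the folder layout from the first post (each folder has `train` and `test` subfolders, and an LR file has the same name as its HR original). The helper name `load_dir_multi` and the standalone `is_image_file` function are hypothetical and not part of the original code. It builds one list of (LR path, HR path) pairs, so each HR path appears once per scale factor:

```python
from os import listdir
from os.path import isfile, join

IMG_EXTENSIONS = (".png", ".jpg", ".jpeg")


def is_image_file(filename):
    # Case-insensitive check on the file extension.
    return filename.lower().endswith(IMG_EXTENSIONS)


def load_dir_multi(dirs_lr, dir_gt, traintest):
    """Return (lr_path, gt_path) pairs collected over all LR folders.

    Hypothetical helper: pairs each LR file with the HR file of the
    same name and skips LR files that have no HR counterpart.
    """
    pairs = []
    gt_root = join(dir_gt, traintest)
    for dir_lr in dirs_lr:
        lr_root = join(dir_lr, traintest)
        # Sort because listdir returns names in arbitrary order.
        for name in sorted(listdir(lr_root)):
            gt_path = join(gt_root, name)
            if is_image_file(name) and isfile(gt_path):
                pairs.append((join(lr_root, name), gt_path))
    return pairs
```

With the `ps` dictionary from the first post, a call could look like `load_dir_multi([ps['DIR_LR2'], ps['DIR_LR3'], ps['DIR_LR4']], ps['DIR_HR'], 'train')`.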

Fabrice_Auzanneau · November 25, 2021, 4:08pm

Thanks for your answer. I’m not sure I understand what you mean, as I’m quite new to PyTorch.

I thought that the first argument of DataLoader should be a list of pairs, each made of an input image and its associated ground truth.
Are you suggesting that it could be a list of 4 images (3 LR and the ground truth)?

ptrblck · November 29, 2021, 3:06am

The first argument to a DataLoader is a dataset.
In your current code snippet you are already working with a custom Dataset, and you can adapt it as needed. My suggestion was to pass all folders into AutoEncoderDataSet and to load all directories into a single list.
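
One way to read this, sketched under assumptions: each item stays a pair (one LR image and its ground truth), and only the list of pairs gets longer. The class name `PairedImageDataSet` and the `loader` argument are hypothetical. The class is written as a plain Python class so the sketch runs without PyTorch installed; in real code it would subclass `torch.utils.data.Dataset`, as the original class does.

```python
class PairedImageDataSet:
    """Map-style dataset sketch over a single list of (lr_path, gt_path) pairs.

    Hypothetical class: the list may mix entries from LR_div2, LR_div3
    and LR_div4, each paired with the matching HR file.
    """

    def __init__(self, samples, loader, transform=None):
        self.samples = list(samples)  # one entry per (LR image, HR image) pair
        self.loader = loader          # callable that opens a path, e.g. PIL.Image.open
        self.transform = transform

    def __len__(self):
        # Number of pairs, i.e. the sum over all LR folders.
        return len(self.samples)

    def __getitem__(self, index):
        lr_path, gt_path = self.samples[index]
        # Same sample layout as in the original __getitem__.
        sample = {'img_in': self.loader(lr_path), 'img_gt': self.loader(gt_path)}
        if self.transform:
            sample = self.transform(sample)
        return sample
```

An instance built this way would be passed to `DataLoader` exactly like `train_set` in the first post; the batch size, shuffling and worker settings do not need to change.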