Data augmentation for labels and images?

import os

from PIL import Image
from import Dataset

class Mydata(Dataset):
    def __init__(self, root_dir, seg_dir, transforms=None):
        self.root_dir = root_dir
        self.seg_dir = seg_dir
        self.transforms = transforms
        self.files = os.listdir(self.root_dir)
        self.labels = os.listdir(self.seg_dir)

    def __len__(self):
        return len(self.files)

    def __getitem__(self, idx):
        img_name = self.files[idx]
        label_name = self.labels[idx]
        img =, img_name))
        label =, label_name))
        if self.transforms:
            img = self.transforms(img)
            label = self.transforms(label)
        return img, label
full_dataset = Mydata('/training', 'label')
train_size = int(0.8 * len(full_dataset))
val_size = len(full_dataset) - train_size
train_dataset, val_dataset =, [train_size, val_size])

In the code above I am trying to do data augmentation / affine transformations.
I do not know if they are similar or not.
How can I do it? Is it supposed to be done after dividing into val and training, or before?

What do you mean by “they are similar”?
The transformations will be applied on both datasets after splitting.

By similar I mean data augmentation and affine transformation.

You can use affine transformations like rotation as data augmentation.
I’m still not sure if I misunderstand the question, but since you passed the transformations to your Dataset, they will be used for each sample (also after splitting the Dataset).

Sorry, my question was not clear. What I mean is: after dividing full_dataset into val and training:

train_size = int(0.8 * len(full_dataset))
val_size = len(full_dataset) - train_size
train_dataset, val_dataset =, [train_size, val_size])

is data augmentation supposed to happen after this point, or after this:

train_loader = data.DataLoader(train_dataset, shuffle=False, batch_size=bs)
val_loader = data.DataLoader(val_dataset, shuffle=False, batch_size=bs)

Thanks for clarifying the question, as I’ve indeed misunderstood it.
The data augmentation (transformation) will be applied lazily, i.e. while each sample is being loaded.
E.g. if you get the sample at index 0 using x, y = train_dataset[0], the transformations will be applied live at this line of code while executing __getitem__.
The same applies for drawing batches from your DataLoader. While creating the batch, each sample will be drawn from the Dataset by calling its __getitem__, such that the transformation will be applied live again.
Does this answer your question?
