Different transformations for train and test dataset

Ali_Aghababaei · August 5, 2022, 4:18am

Hi everyone,
I have created a custom Dataset class for a set of images and want to use K-fold cross-validation to split them into train and test datasets (I don’t want a separate validation dataset). The torchvision transforms differ between the two, since I want to apply data augmentation only to the train dataset, not the test dataset. What is the best way to manage different transformations for each dataset in each fold? Should I define a separate class for the train and test dataset in each fold?

sudomaze · August 6, 2022, 1:11pm

Hi Ali,

You can write a function that returns a Compose, with a parameter that determines whether it builds the train or the test pipeline. Here is an example:


import torchvision.transforms as T

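def get_transforms(mode):
    transforms = []
    if mode == 'train':
        # Augmentations applied to the training data only
        transforms.extend([
            # ...
        ])
    # Preprocessing shared by train and test
    transforms.extend([
        T.Resize((256, 256)),  # resize every image to 256x256
        T.ToTensor(),
        T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])
    return T.Compose(transforms)

# Dataset is your custom dataset class, taking the transform as its argument
train_set = Dataset(get_transforms('train'))
test_set = Dataset(get_transforms('test'))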
# ...

You can refer to this tutorial for more examples.