Hello all,
I am new to PyTorch and deep learning, and I am running into an issue applying transforms to my training and test subsets. I have a combined dataset, which I split into training and test sets using scikit-learn's train_test_split.
import torch
from sklearn.model_selection import train_test_split
from torchvision import datasets, transforms

combined_dataset = datasets.ImageFolder("DiBAS-Images/train", transform=None)

def train_val_split(dataset, val_split=0.25):
    # Split the sample indices, then wrap each index list in a Subset of the same dataset
    train_idx, val_idx = train_test_split(list(range(len(dataset))), test_size=val_split)
    dataset_train = torch.utils.data.Subset(dataset, train_idx)
    dataset_val = torch.utils.data.Subset(dataset, val_idx)
    return dataset_train, dataset_val

dataset_train, dataset_val = train_val_split(combined_dataset)
I wanted to apply separate transforms to my train and test sets. For my training set:
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(10),
    transforms.RandomAffine(0, shear=10, scale=(0.8, 1.2)),
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])
and for my test set:
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),  # mean, standard deviation
])
I am following the method outlined here:
dataset_train.dataset.transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(10),
    transforms.RandomAffine(0, shear=10, scale=(0.8, 1.2)),
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])
However, I suspect this applies the transform to both subsets, since the dataset_train.dataset attribute refers to the combined dataset that both subsets share. Am I correct in this assumption? If so, how would one apply two separate transforms to two subsets of the same combined dataset?
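
For reference, torch.utils.data.Subset only stores a reference to the dataset it was given plus a list of indices, so both subsets here point at the same ImageFolder object, and assigning to dataset_train.dataset.transform changes what dataset_val returns as well. One common workaround, sketched below under assumptions, is to leave the combined dataset with transform=None and wrap each subset in a small class that applies its own transform. The class name TransformedSubset is hypothetical and not part of PyTorch, and the toy data stands in for real images so the sketch runs without any files.

```python
class TransformedSubset:
    """Map-style dataset wrapper that applies its own transform to each sample.

    Hypothetical helper, not part of PyTorch. `subset` can be any indexable
    object returning (image, label) pairs, for example a
    torch.utils.data.Subset whose underlying dataset has transform=None.
    """

    def __init__(self, subset, transform=None):
        self.subset = subset
        self.transform = transform

    def __len__(self):
        return len(self.subset)

    def __getitem__(self, idx):
        # Fetch the untransformed sample, then apply this wrapper's transform only.
        image, label = self.subset[idx]
        if self.transform is not None:
            image = self.transform(image)
        return image, label


# Toy stand-ins (assumptions for illustration): the "images" are integers
# and the "transforms" are plain functions.
raw_train = [(1, 0), (2, 1)]
raw_val = [(3, 0)]

train_data = TransformedSubset(raw_train, transform=lambda x: x * 10)
val_data = TransformedSubset(raw_val, transform=lambda x: x + 1)

print(train_data[1])  # (20, 1)
print(val_data[0])    # (4, 0)
```

With the code in this post, the same idea would be TransformedSubset(dataset_train, <training transform>) and TransformedSubset(dataset_val, <test transform>), with each wrapper then passed to its own DataLoader. In real code the class would typically subclass torch.utils.data.Dataset, although a DataLoader should also accept any object that implements __getitem__ and __len__.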