Folks, I downloaded the flowers dataset (images of 5 classes), which I load with ImageFolder. I then split the entire dataset using torch.utils.data.random_split into training, validation, and testing sets.
The issue is that I have two different transforms I want to apply: one for training, which has data augmentation, and another for validation and testing, which does not.
Question: what is the best way to apply the two different transforms to the 3 datasets? Unfortunately, it won't work to pass the transform to ImageFolder, as it will apply the same transform to all images.
I found split-folders that will split the dataset into training, testing and validation (https://pypi.org/project/split-folders) and then I guess I could use three different calls to ImageFolder to build the datasets with each of their transforms. Is there a better way by just using a single call to ImageFolder?
Thanks in advance for any help on this one!
Jacob
import torch
from torchvision import datasets, transforms

# Training - random augmentation, then convert to a normalized tensor
data_transform_train = transforms.Compose([
    transforms.RandomRotation(30),
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])])
# Validation and Testing - just resize and crop the images
data_transform = transforms.Compose([
    transforms.Resize(255),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])])
# The problem: the training transform set here ends up applied to all three splits
dataset = datasets.ImageFolder(data_dir, transform=data_transform_train)
train, val, test = torch.utils.data.random_split(dataset, [3459, 432, 432])
trainLoader = torch.utils.data.DataLoader(train, batch_size=batch_size,
                                          num_workers=num_workers, drop_last=True, shuffle=True)
valLoader = torch.utils.data.DataLoader(val, batch_size=batch_size,
                                        num_workers=num_workers, drop_last=True)
testLoader = torch.utils.data.DataLoader(test, batch_size=batch_size,
                                         num_workers=num_workers, drop_last=True)
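
One possible way to keep a single ImageFolder call, sketched here under some assumptions: create the ImageFolder without a transform, split it with random_split as before, and wrap each split in a small dataset that applies its own transform when a sample is fetched. The class name TransformedSubset is hypothetical (it is not part of torch or torchvision). The sketch deliberately avoids importing torch so that it runs standalone, relying on the fact that a map-style dataset only needs __len__ and __getitem__; in practice you might prefer to subclass torch.utils.data.Dataset.

```python
class TransformedSubset:
    """Hypothetical helper: wraps a map-style dataset (for example a Subset
    returned by random_split) and applies its own transform to each sample."""

    def __init__(self, subset, transform=None):
        self.subset = subset
        self.transform = transform

    def __len__(self):
        return len(self.subset)

    def __getitem__(self, index):
        # The wrapped dataset is assumed to yield (image, label) pairs.
        image, label = self.subset[index]
        if self.transform is not None:
            image = self.transform(image)
        return image, label


# Stand-in data so the sketch runs without torch: (sample, label) pairs.
raw = [(1, 0), (2, 1), (3, 0)]
train_view = TransformedSubset(raw, transform=lambda x: x * 10)
eval_view = TransformedSubset(raw, transform=lambda x: x + 1)

# Intended use with the code from the post (not executed here):
#   dataset = datasets.ImageFolder(data_dir)  # no transform, yields PIL images
#   train, val, test = torch.utils.data.random_split(dataset, [3459, 432, 432])
#   train = TransformedSubset(train, data_transform_train)
#   val = TransformedSubset(val, data_transform)
#   test = TransformedSubset(test, data_transform)
```

The wrapped splits can then be passed to DataLoader exactly like the original ones. Because the underlying ImageFolder has no transform, each wrapper receives the raw PIL image and applies only its own pipeline.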