Hello, I am attempting to copy the data augmentation methods applied in a certain paper, but I don’t understand how to do it. (The work in the paper was done using TensorFlow.)
Note that I am a PyTorch noob.
Background (perhaps not important):
I want to see whether a pre-trained CNN can improve on the results reported in the paper, so I would like to hold everything else (such as the data augmentation) more or less constant. That makes it easier to isolate the effect of the pre-trained network.
More detail:
I am working with a dataset containing approximately 14,000 images. In the aforementioned paper, the following augmentations are applied (ignoring resizing):
-five crop
-rotation by 0, 90, 180 and 270 degrees
-reflection about the line y = x.
This increases the dataset by a factor of 5 x 4 x 2 = 40, so from about 14,000 to about 560,000. According to the paper, these augmentations do not appear to be random, so I’m interested in applying them deterministically.
I am aware of torchvision.transforms.functional, but I don’t understand how to use it. I will share some of my code below.
Code:
I’m only sharing part of my code, to avoid making this post too long. My code below uses random transforms because I’ve just been trying to get the network to train and test properly, which it does. When I try to use functional transforms I get errors.
My transforms:
The commented-out lines are things I was experimenting with.
import torch
from torchvision import transforms
#import torchvision.transforms.functional as tf

img_height, img_width = 256, 256
size = [224, 224]
torch.manual_seed(17)

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Resize((img_height, img_width)),
    transforms.RandomCrop(size),
    transforms.RandomHorizontalFlip(),
    transforms.RandomVerticalFlip(),
    transforms.RandomRotation(180),
    # transforms.FiveCrop(size),
    # transforms.Lambda(lambda crops: torch.stack([transforms.ToTensor()(crop) for crop in crops])),
    # transforms.Lambda(lambda crops: torch.stack([transforms.PILToTensor()(crop) for crop in crops])),
    # The tf.* attempts below are not valid Compose entries: Compose expects
    # callables, and `img` is not defined at this point.
    # tf.five_crop(tf.rotate(img, 90), size),
    # tf.five_crop(tf.rotate(img, 180), size),
    # tf.five_crop(tf.rotate(img, 270), size),
    # tf.five_crop(tf.rotate(tf.vflip(img), 90), size),
    # tf.five_crop(tf.rotate(tf.rotate(tf.vflip(img), 90), 90), size),
    # tf.five_crop(tf.rotate(tf.rotate(tf.vflip(img), 90), 180), size),
    # tf.five_crop(tf.rotate(tf.rotate(tf.vflip(img), 90), 270), size),
])

# def transform1(image):
#     return tf.five_crop(tf.resize(image, (img_height, img_width)), size)
My code for training the model is directly copied from this link: https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html
My data loader and associated dictionary:
from torch.utils.data import DataLoader
train_dataloader = DataLoader(train_dataset, batch_size=64, shuffle=True)
test_dataloader = DataLoader(test_dataset, batch_size=64, shuffle=True)
dataloaders_dict = {
    "train": train_dataloader,
    # The training code I'm using expects 'val' rather than 'test',
    # so the test loader is registered under 'val'.
    "val": test_dataloader,
}
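An alternative that keeps ordinary 4-D batches, so the tutorial's training loop would not need changes, is to wrap the dataset so that each base image appears as 40 separate fixed samples. The sketch below is only an illustration under assumptions: FortyViewDataset is a hypothetical name, the wrapped dataset is assumed to return (tensor, label) pairs with the image already resized to 256 x 256 as a (C, H, W) tensor, and the crops are taken by plain slicing rather than with torchvision.

```python
import torch
from torch.utils.data import Dataset


class FortyViewDataset(Dataset):
    """Hypothetical wrapper exposing each base sample as 40 fixed views."""

    def __init__(self, base, crop=224):
        self.base = base  # any dataset returning ((C, H, W) tensor, label)
        self.crop = crop

    def __len__(self):
        return 40 * len(self.base)

    def __getitem__(self, index):
        img_idx, view = divmod(index, 40)
        crop_idx, sym = divmod(view, 8)  # 5 crops x 8 orientations
        reflect, k = divmod(sym, 4)      # 2 reflections x 4 rotations
        img, label = self.base[img_idx]
        h, w = img.shape[-2:]
        c = self.crop
        # Top-left corners: four image corners, then the center crop.
        tops = (0, 0, h - c, h - c, (h - c) // 2)
        lefts = (0, w - c, 0, w - c, (w - c) // 2)
        top, left = tops[crop_idx], lefts[crop_idx]
        out = img[:, top:top + c, left:left + c]
        if reflect:
            out = out.transpose(1, 2)  # reflect about the main diagonal
        out = torch.rot90(out, k, dims=(1, 2))
        return out, label
```

Wrapping the training set as FortyViewDataset(train_dataset) before passing it to DataLoader would give 40 times as many samples per epoch. The trade-off is that each base image is loaded once per view, which may be slow for images read from disk.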
My hope is that (assuming it’s not too hard) someone can just show me how to write the desired transforms.
Thanks for any help.