Data augmentation (transferring to different locations)

Hi, I am looking to apply data augmentation to CIFAR-10 so that each image ends up in a different position, e.g. one in the top-right corner, another in the bottom-left corner, and so on. I am not sure whether this is the correct approach: first create a box around each image using padding, then translate the image within it? Is transforms.RandomAffine suitable for this kind of translation? If so, why does its translate option have to be between 0 and 1?

Thank you,

Would you like to crop the input image at these different positions?
If so, torchvision.transforms.FiveCrop (or TenCrop) might be usable.
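
For reference, a minimal sketch of what FiveCrop returns (the image path and the crop size of 24 are placeholders, not values from this thread): it produces a tuple of five crops, one per corner plus the center.

import torchvision.transforms as transforms
from PIL import Image

img = Image.open("example.png")      # placeholder path
five_crop = transforms.FiveCrop(24)  # crop size is just an example value
crops = five_crop(img)               # tuple of 5 PIL images: 4 corners + center
print(len(crops))                    # 5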

The translate argument is defined as:

translate (tuple, optional) – tuple of maximum absolute fraction for horizontal and vertical translations. For example translate=(a, b), then horizontal shift is randomly sampled in the range -img_width * a < dx < img_width * a and vertical shift is randomly sampled in the range -img_height * b < dy < img_height * b. Will not translate by default.

which means the translation is specified as a fraction of the image width and height rather than in pixels, and that is why the values have to be between 0 and 1.
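
As a concrete illustration (the fraction 0.25 below is just an example value): on 32x32 CIFAR-10 images, translate=(0.25, 0.25) samples horizontal and vertical shifts of up to 8 pixels each.

import torchvision.transforms as transforms

# translate is a fraction of the image size, not a pixel count:
# on 32x32 images, (0.25, 0.25) samples dx and dy in (-8, 8) pixels.
affine_transform = transforms.Compose([
    transforms.RandomAffine(degrees=0, translate=(0.25, 0.25)),
    transforms.ToTensor(),
])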

Thank you for your response, @ptrblck.

To make sure that I understand it properly, you mean replacing

transform_train = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomRotation(15),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    normalize_transform,
])

with

transform_train = transforms.Compose([
    transforms.FiveCrop(40),
    transforms.Lambda(lambda crops: torch.stack([transforms.ToTensor()(crop) for crop in crops])),
    transforms.RandomRotation(15),
    transforms.RandomHorizontalFlip(),
    normalize_transform,
])

and model(input) in training procedure should be changed to:

bs, ncrops, c, h, w = input.size()               # e.g. bs=128, ncrops=5, c=3, h=40, w=40
result = model(input.view(-1, c, h, w))          # fuse batch size and ncrops
result_avg = result.view(bs, ncrops, -1).mean(1) # avg over crops

and result_avg will then be used for calculating the loss and the predictions?
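
For reference, a minimal sketch of that step, assuming a standard classification setup (criterion and target are illustrative names for the loss function, e.g. nn.CrossEntropyLoss, and the label batch):

loss = criterion(result_avg, target)  # loss on the crop-averaged logits
pred = result_avg.argmax(dim=1)       # predicted class per image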

This generates extra data to train at different locations, and the image size should be increased for that (say, from 32 to 40) to see some difference?

I think your new transformation won’t work, as you are trying to apply image transformations (e.g. RandomRotation) on tensors.

Instead you could apply FiveCrop, loop over each cropped image, and apply the rotation to it.

I’m not sure which size you are referring to in the last sentence. CIFAR-10 images have a spatial size of 32x32. You could of course pad them to 40x40 and check if that helps.
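
Putting both suggestions together, a minimal sketch (the exact sizes and the per-crop composition are assumptions, not a tested recipe): pad the 32x32 images to 40x40, take five 32x32 crops at different positions, and apply the rotation and flip to each crop before converting it to a tensor.

import torch
import torchvision.transforms as transforms

# Per-crop transforms: RandomRotation and RandomHorizontalFlip operate on
# PIL images, so they run before ToTensor; Normalize works on tensors.
per_crop = transforms.Compose([
    transforms.RandomRotation(15),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    normalize_transform,  # assumed to be a transforms.Normalize instance
])

transform_train = transforms.Compose([
    transforms.Pad(4),        # 32x32 -> 40x40
    transforms.FiveCrop(32),  # five 32x32 crops: 4 corners + center
    transforms.Lambda(lambda crops: torch.stack([per_crop(c) for c in crops])),
])

Each batch then has shape [bs, 5, 3, 32, 32], which matches the view(-1, c, h, w) pattern above.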

Thank you, @ptrblck.