I am training a model that requires variable-sized inputs ((4, 4), (8, 8), or (16, 16)) for data augmentation.
I am using the following transform pipeline to achieve this:
from PIL import Image
import torchvision.transforms as Transforms

sizes = [(4, 4), (8, 8), (16, 16)]
# Randomly pick one of the target sizes for each image
ResizeTransformation = Transforms.RandomChoice(
    [Transforms.Resize(size) for size in sizes])

transformList = [Transforms.RandomRotation(180, Image.BILINEAR),
                 Transforms.CenterCrop(220),
                 Transforms.RandomCrop(128),
                 Transforms.RandomHorizontalFlip(),
                 ResizeTransformation,
                 Transforms.ToTensor(),
                 Transforms.Normalize((0.1483, 0.1210, 0.1257),
                                      (0.1292, 0.1050, 0.1186))]

transform = Transforms.Compose(transformList)
But this creates an issue when using a batch size > 1, as images in the same batch can be resized to different dimensions.
I’m thinking that if we use the same seed value while transforming all the images in a batch, we can get a single tensor of shape (N, C, H, W), since the transform will then resize every image in the batch to the same dimensions.
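To make that concrete, here is a rough sketch of what I have in mind: move the random Resize out of transformList and into a custom collate_fn, so the target size is drawn once per batch rather than once per image. This assumes the dataset returns (image_tensor, label) pairs with integer labels, and a torchvision version whose functional resize works on tensors; same_size_collate is just a name I made up.

import random
import torch
import torchvision.transforms.functional as TF
from torch.utils.data import DataLoader

def same_size_collate(batch):
    # Draw ONE target size for the whole batch instead of one per image
    size = random.choice(sizes)
    # Resize each (C, H, W) tensor to that size, then stack into (N, C, H, W)
    images = torch.stack([TF.resize(img, list(size)) for img, _ in batch])
    labels = torch.tensor([label for _, label in batch])
    return images, labels

# 'dataset' would apply transformList per sample, but with
# ResizeTransformation removed, since the resize now happens here
loader = DataLoader(dataset, batch_size=32, collate_fn=same_size_collate)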
I have seen on the forum that we can return a list of differently sized tensors from a custom collate_fn, but then aren’t we effectively training with batch_size = 1, since we’d have to pass the images through the network one at a time because of their different sizes?
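For reference, the collate_fn pattern I read about is roughly along these lines (again just a sketch, with a name I made up):

def list_collate(batch):
    # Keep the differently sized images as a plain Python list instead of stacking
    images = [img for img, _ in batch]
    labels = torch.tensor([label for _, label in batch])
    return images, labels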
Looking for a way to create a single tensor of shape (N, C, H, W) with variable-sized inputs.
TIA