How to resize and pad in a torchvision.transforms.Compose()?

ntomita · July 13, 2021, 8:56pm

I would extend @weisunding's code to be more precise, as follows.

import torchvision.transforms.functional as F
from torchvision import transforms

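class SquarePad:
    """Zero-pad a PIL image on its shorter side so that it becomes square."""

    def __call__(self, image):
        # For a PIL image, image.size is (width, height)
        max_wh = max(image.size)
        # Half of the missing pixels (rounded down) go to the left/top ...
        p_left, p_top = [(max_wh - s) // 2 for s in image.size]
        # ... and whatever remains goes to the right/bottom, so odd differences still add up
        p_right, p_bottom = [max_wh - (s + pad) for s, pad in zip(image.size, [p_left, p_top])]
        padding = (p_left, p_top, p_right, p_bottom)
        return F.pad(image, padding, 0, 'constant')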

target_image_size = (224, 224)  # as an example
# now use it in place of the transforms.Pad class
transform = transforms.Compose([
    SquarePad(),
    transforms.Resize(target_image_size),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])

The original code fails to return a square image when the width and height differ by an odd number of pixels, because it pads both sides with the same rounded-down half of the difference and ends up one pixel short. The code above guarantees a square image by computing the residuals (see the line starting with p_right). It also drops np to reduce dependencies, and drops CenterCrop because it does nothing after a Resize to the same size.
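
To check the residual computation without loading an image, the padding arithmetic can be pulled out into plain functions. This is only a sketch: `square_padding`, `symmetric_padding` and `padded_size` are hypothetical helper names, not part of torchvision, and `symmetric_padding` is my assumption of what a version that pads both sides equally would compute.

```python
def square_padding(width, height):
    """Return (left, top, right, bottom) using the residual approach from SquarePad."""
    size = (width, height)
    max_wh = max(size)
    # Rounded-down half of the missing pixels on the left/top
    p_left, p_top = [(max_wh - s) // 2 for s in size]
    # Remaining pixels on the right/bottom
    p_right, p_bottom = [max_wh - (s + pad) for s, pad in zip(size, [p_left, p_top])]
    return (p_left, p_top, p_right, p_bottom)


def symmetric_padding(width, height):
    """Assumed variant for comparison: same rounded-down padding on both sides."""
    max_wh = max(width, height)
    hp = (max_wh - width) // 2
    vp = (max_wh - height) // 2
    return (hp, vp, hp, vp)


def padded_size(width, height, padding):
    """Return the (width, height) an image would have after applying the padding."""
    left, top, right, bottom = padding
    return (width + left + right, height + top + bottom)


if __name__ == "__main__":
    w, h = 5, 8  # the sides differ by an odd number of pixels
    print(padded_size(w, h, symmetric_padding(w, h)))  # (7, 8), not square
    print(padded_size(w, h, square_padding(w, h)))     # (8, 8), square
```

When the difference is even, both variants give the same padding, so the residual version only changes the result in the odd case.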