Dynamic padding based on input shape

Hi,
For my model, the input (image) dimensions need to be divisible by 32, and I would like to pad the input dynamically to meet this requirement. For example, if the input is 520x520x3, I want it padded to 544x544x3.

torch.nn.functional.pad() requires the pad argument to be a list of ints, but my padding amounts would be computed dynamically from the shape of the input.
The model (including the padding) is supposed to be exported to ONNX, so it has to be traceable.
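
Roughly what I am trying to do is the following (just a sketch of the intent, with made-up names; the pad amounts here are plain Python ints computed from the runtime shape, which is exactly what gets baked in as constants when the model is traced):

import torch
import torch.nn.functional as F

def pad_to_multiple_of_32(x):
    # x has shape (N, C, H, W); pad right/bottom so H and W become multiples of 32
    h, w = x.shape[-2], x.shape[-1]
    pad_h = (32 - h % 32) % 32
    pad_w = (32 - w % 32) % 32
    # F.pad expects ints: (left, right, top, bottom) for the last two dims
    return F.pad(x, [0, pad_w, 0, pad_h])

x = torch.randn(1, 3, 520, 520)
print(pad_to_multiple_of_32(x).shape)  # torch.Size([1, 3, 544, 544])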

Does anybody have a clue on how to approach this problem?
Thanks in advance,
Andreas

Hey, this reply is a little late, but I was struggling with the same problem, so I hope this helps somebody in the future. I solved it along the lines of this example:

from torchvision import transforms
from torchvision.datasets.folder import pil_loader
from functools import partial
from math import ceil

p = "path/to/img.png"
i = pil_loader(p)

print("Shape:", i.size)
image_size = 544

def pad_to_minimum_size(min_size, image):
    # PIL's Image.size is (width, height)
    w, h = image.size
    w_diff = w - min_size
    h_diff = h - min_size
    # Pad half the deficit on each side (rounded up) if the dimension is too small
    w_pad = ceil(abs(w_diff) / 2) if w_diff < 0 else 0
    h_pad = ceil(abs(h_diff) / 2) if h_diff < 0 else 0

    if w_pad == 0 and h_pad == 0:
        return image
    else:
        # A 2-element padding is applied as (left/right, top/bottom)
        return transforms.functional.pad(image, [w_pad, h_pad])

transform = transforms.Compose([
    transforms.Lambda(partial(pad_to_minimum_size, image_size)),
    transforms.RandomCrop(image_size),
])

i_transformed = transform(i)
print("Shape after transform:", i_transformed.size)

Output:

Shape: (512, 383)
Shape after transform: (544, 544)

This should dynamically pad any smaller image up to the given image_size. The same pattern could be used for dynamic resizing as well.
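
For the ONNX/traceable part of the original question: the above is a preprocessing transform on PIL images, so it does not end up in the exported graph. If the padding has to live inside the model and stay dynamic after export, one option I'm aware of (sketch only, not verified against every PyTorch/opset combination; module and file names are made up) is to do the shape arithmetic in a scripted module, since tracing would bake the pad amounts in as constants:

import torch
import torch.nn.functional as F

class PadToMultipleOf32(torch.nn.Module):
    def forward(self, x):
        # Under torch.jit.script the shape arithmetic stays symbolic,
        # so the pad amounts can follow the runtime input size.
        h, w = x.shape[-2], x.shape[-1]
        pad_h = (32 - h % 32) % 32
        pad_w = (32 - w % 32) % 32
        return F.pad(x, [0, pad_w, 0, pad_h])

scripted = torch.jit.script(PadToMultipleOf32())
dummy = torch.randn(1, 3, 520, 520)
torch.onnx.export(
    scripted, dummy, "pad_to_multiple_of_32.onnx",
    opset_version=11,  # Pad with runtime pad amounts needs opset >= 11
    input_names=["input"], output_names=["output"],
    dynamic_axes={"input": {2: "height", 3: "width"}},
)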

Cheers!