Best way to extract image patches around a set of landmarks

tdeboissiere · July 10, 2017, 1:46am

Gday all,

say I have

a batch of images (batch_size, channel, height, width).
a batch of landmarks (batch_size, n_landmarks, 2), i.e. for each image we have n_landmarks, and each landmark is an (x, y) pair of coordinates in the image.

For each image, I want to extract a patch of size (channel, patch_height, patch_width) around each of the corresponding landmarks. The TensorFlow equivalent is https://www.tensorflow.org/api_docs/python/tf/image/extract_glimpse

I have a numpy implementation here: https://gist.github.com/tdeboissiere/4e1ff2de0ceae5704bbaf06ceb9fd301

This patch extraction occurs several times within the pipeline, so I’d like to do it on the GPU. Is there a smart way to do it without writing tons of loops?
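
For reference, one loop-free way this could be done is to build a sampling grid around every landmark and hand it to torch.nn.functional.grid_sample in a single call. The sketch below is only an illustration under some assumptions: extract_patches is a made-up name, landmarks are taken to be (x, y) pixel coordinates marking the patch center, and anything falling outside the image is filled with zeros. It is not the gist's method nor the exact behaviour of tf.image.extract_glimpse.

```python
import torch
import torch.nn.functional as F


def extract_patches(images, landmarks, patch_size):
    """Hypothetical helper: crop one patch centered on every landmark.

    images:     (B, C, H, W) float tensor
    landmarks:  (B, N, 2) tensor of (x, y) pixel coordinates (assumed convention)
    patch_size: (patch_height, patch_width)
    returns:    (B, N, C, patch_height, patch_width)
    """
    B, C, H, W = images.shape
    N = landmarks.shape[1]
    ph, pw = patch_size

    # Pixel offsets of each patch cell relative to the patch center.
    dy = torch.arange(ph, device=images.device, dtype=images.dtype) - (ph - 1) / 2
    dx = torch.arange(pw, device=images.device, dtype=images.dtype) - (pw - 1) / 2

    lm = landmarks.to(device=images.device, dtype=images.dtype)
    # Absolute sampling positions in pixels, each of shape (B, N, ph, pw).
    xs = (lm[..., 0, None, None] + dx.view(1, 1, 1, pw)).expand(B, N, ph, pw)
    ys = (lm[..., 1, None, None] + dy.view(1, 1, ph, 1)).expand(B, N, ph, pw)

    # grid_sample expects (x, y) in [-1, 1]; with align_corners=True,
    # -1 and 1 map to the centers of the first and last pixels.
    grid = torch.stack([2 * xs / (W - 1) - 1, 2 * ys / (H - 1) - 1], dim=-1)

    # Fold the landmark axis into the output height so that a single
    # grid_sample call produces all N patches of every image.
    grid = grid.reshape(B, N * ph, pw, 2)
    out = F.grid_sample(images, grid, mode="bilinear",
                        padding_mode="zeros", align_corners=True)

    # (B, C, N * ph, pw) -> (B, N, C, ph, pw)
    return out.reshape(B, C, N, ph, pw).permute(0, 2, 1, 3, 4)


# Usage: 4 images, 5 landmarks each, 7x7 patches.
images = torch.rand(4, 3, 32, 32)
landmarks = torch.randint(0, 32, (4, 5, 2))
patches = extract_patches(images, landmarks, (7, 7))  # shape (4, 5, 3, 7, 7)
```

Everything here stays on whatever device the images are on, and because grid_sample interpolates, sub-pixel landmark positions work too. With odd patch sizes and integer landmarks the result should match plain slicing up to floating-point error.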

Frida · March 1, 2020, 2:45pm

solution?

marioviti · May 14, 2020, 4:49pm

I’m also very interested!

sjzcv · April 8, 2021, 2:57am

Here is an implementation:

https://github.com/jimmysue/xvision/blob/main/xvision/ops/extract_glimpse.py#L14


import torch
from typing import Union, Tuple

"""
tf.image.extract_glimpse(
    input, size, offsets, centered=True, normalized=True, noise='uniform',
    name=None
)
"""


def extract_glimpse(input: torch.Tensor, size: Tuple[int, int], offsets, centered=True, normalized=True, mode='bilinear'):
    # usage is similar to tf.image.extract_glimpse:
    #   https://www.tensorflow.org/api_docs/python/tf/image/extract_glimpse
    # input: [B, C, H, W]
    # size:  [int, int]  specifies the size of the glimpse, height comes first
    # offsets: [B, 2]
    W, H = input.size(-1), input.size(-2)

    if normalized and centered:
        offsets = (offsets + 1) * offsets.new_tensor([W/2, H/2])
    elif normalized: