Mapping of old pixels to new using torchvision.transforms.functional.rotate

I have my own dataset of images and per-image object labels; each label is a set of (x, y) points forming a convex polygon.
I want to implement my own augmentation.
Specifically, I want to rotate using torchvision.transforms.functional.rotate(..., expand=True).
However I also need the transformation function, so I can apply it on my labels and get the new set of points defining the polygon of the object.
Is there a way, after calling rotate, to have the transformation map from old pixels to new pixels?


@M_S I would suggest that instead of using the built-in transforms, you design your own transformations. You can import them and use them as needed, which should help with your problem. I'll add a link as an example -

Give it a look. Using your own file gives you more freedom to experiment.

Thanks for the answer.
That would require me to calculate the mapping function of a rotation with expansion myself. My question was whether such an implementation already exists in a library (it doesn't have to be PyTorch-native code, but that is preferred, since the rest of my augmentations are).

It's very simple: you just have to create a rotation matrix:
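For example, a minimal sketch of rotating polygon points about the image center (no expansion). The sign convention here assumes image coordinates with y pointing down and a positive angle rotating counter-clockwise, as in torchvision/PIL; it's worth double-checking against your setup:

```python
import numpy as np

def rotate_points(points, angle_deg, w, h):
    """Rotate (x, y) points about the image center (no expand).

    Assumes y points down and a positive angle rotates the image
    counter-clockwise (PIL/torchvision convention).
    """
    theta = np.deg2rad(angle_deg)
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    # With y pointing down, a counter-clockwise rotation corresponds
    # to this matrix on (x, y) pixel coordinates.
    R = np.array([[np.cos(theta), np.sin(theta)],
                  [-np.sin(theta), np.cos(theta)]])
    pts = np.asarray(points, dtype=np.float64) - [cx, cy]
    return pts @ R.T + [cx, cy]
```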

or you can check this post for suggested codes:

Yes, but as I mentioned, I require expansion, so a translation follows the rotation.
A solution I had in mind is to first pad the original images a lot, which would let me skip expanding, and then it is indeed an easy calculation with a rotation matrix.
However, I am still looking for an existing solution to this problem. If there is none, then I'll implement it…
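I'm not aware of torchvision exposing this map directly, but the expansion step is just a translation by the minimum of the rotated corner coordinates. A sketch of the full old-to-new mapping for a rotation with expansion; note that PIL/torchvision do their own rounding when computing the expanded size, so the result may differ by a pixel and should be verified against your version:

```python
import math

def rotate_with_expand_map(angle_deg, w, h):
    """Return (transform, (new_w, new_h)) for a rotation with expand=True.

    transform maps old (x, y) pixel coordinates to coordinates in the
    expanded canvas. Sketch only: PIL/torchvision round the expanded
    size slightly differently, so sizes may be off by one pixel.
    """
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    cx, cy = w / 2.0, h / 2.0

    def rot(x, y):
        # rotate about the old center (y down, positive angle = CCW)
        dx, dy = x - cx, y - cy
        return c * dx + s * dy + cx, -s * dx + c * dy + cy

    # the expanded canvas is the bounding box of the rotated corners
    corners = [rot(x, y) for x, y in [(0, 0), (w, 0), (w, h), (0, h)]]
    xs, ys = zip(*corners)
    min_x, min_y = min(xs), min(ys)
    # epsilon guards against floating-point noise at exact right angles
    new_w = int(math.ceil(max(xs) - min_x - 1e-7))
    new_h = int(math.ceil(max(ys) - min_y - 1e-7))

    def transform(x, y):
        nx, ny = rot(x, y)
        return nx - min_x, ny - min_y  # translate into the new canvas

    return transform, (new_w, new_h)
```

Applying `transform` to each vertex of a polygon label gives its coordinates in the expanded image.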

There is a weird way (I will try it soon) to do this: make an (x, y) grid, transform it to a PIL image, apply the reverse transforms to that grid, and voilà, you get the mapping from old coordinates to new.

        import torch
        from PIL import Image
        from torchvision import transforms

        w, h = img.width, img.height
        assert w == h

        # encode each pixel's own normalized (x, y) coordinates as an image;
        # note ToPILImage quantizes to uint8, so precision is limited to 1/255
        grid_x = torch.arange(w).repeat(h, 1).view([h, w]).float() / w
        grid_y = torch.arange(h).repeat(w, 1).t().view([h, w]).float() / h
        dummy = torch.zeros((h, w))
        grid_xy = torch.stack([grid_x, dummy, grid_y])
        grid_xy = transforms.ToPILImage()(grid_xy)

        rot = 30
        img = img.rotate(rot, Image.NEAREST, expand=0)
        # the grid gets the inverse rotation, so afterwards the grid value at
        # an old pixel position holds that pixel's new normalized coordinates
        grid_xy = grid_xy.rotate(-rot, Image.NEAREST, expand=0)

        grid_xy = transforms.ToTensor()(grid_xy)
        # labels store normalized coordinates in columns 2 and 3;
        # indices must be integer tensors for the lookup
        x_orig = (label[:, 2] * w).long()
        y_orig = (label[:, 3] * h).long()

        label[:, 2] = grid_xy[0, y_orig, x_orig]
        label[:, 3] = grid_xy[2, y_orig, x_orig]

The weirdest thing here is that you can apply any number of transformations whatsoever without any geometry headache; the only thing you have to care about is applying the inverse transforms to grid_xy in reverse order.
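A self-contained sketch of this grid trick with two chained transforms (a rotation followed by a horizontal flip), assuming plain PIL and float ('F'-mode) grids to avoid the uint8 quantization. The grid receives the inverse transforms in reverse order. One caveat: pixels rotated outside the canvas read back the fill value, so the lookup is only valid for points that stay inside the frame:

```python
from PIL import Image
import numpy as np

w = h = 64
# two float "images" storing each pixel's own x and y coordinate
grid_x = Image.fromarray(np.tile(np.arange(w, dtype=np.float32), (h, 1)))
grid_y = Image.fromarray(np.tile(np.arange(h, dtype=np.float32)[:, None], (1, w)))

rot = 30

def img_fwd(im):
    # transforms applied to the real image: rotate, then horizontal flip
    return im.rotate(rot, Image.NEAREST).transpose(Image.FLIP_LEFT_RIGHT)

def grid_inv(im):
    # inverse transforms in reverse order, applied to the coordinate grids
    return im.transpose(Image.FLIP_LEFT_RIGHT).rotate(-rot, Image.NEAREST)

gx = np.asarray(grid_inv(grid_x))
gy = np.asarray(grid_inv(grid_y))
# now (gx[y, x], gy[y, x]) is (approximately) where old pixel (x, y) lands;
# pixels rotated out of the frame read back the fill value 0 and are invalid
```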