Get location of RandomCrop

Hi,
Using torchvision.transforms.RandomCrop, is there a way to get the location from which the crop was taken?
For example, the coordinates of the top-left corner of the crop in the original image.

Thanks

Hello Tristan,

I’m not sure if there’s a way to do this directly, but you can write your own custom transform that does retain the coordinates, while piggybacking on the RandomCrop sampler, per ptrblck’s suggestion here.

Specifically, this bit:

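from torchvision import transforms
import torchvision.transforms.functional as TF

# Inside the custom transform's __call__:
self.crop_indices = transforms.RandomCrop.get_params(image, output_size=(512, 512))
i, j, h, w = self.crop_indices  # (top, left, height, width); h, w just echo output_size, so 512, 512 here
image = TF.crop(image, i, j, h, w)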

If you post your solution in a public repo perhaps others would use it as well. I can imagine the usefulness extends to other Random* transforms.

Best,
Andrei

Hello Andrei,
Thanks, the solution you provided works to some extent.
However, transforms.RandomCrop can also crop with padding.
I tried:
transforms.RandomCrop(size=(224, 224), padding=128).get_params(image, output_size=(224, 224))
But I never get parameters that extend over the border of the image.
Also, TF.crop would not be able to handle any padding.

Hi -

For the padded case, what if you just pad the image in a previous step, and then repeat the suggested approach? Something like:

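from torchvision import transforms

padded_image = transforms.Pad(128)(image)  # image is a PIL image here
# get_params is a static method, so no RandomCrop instance is needed
i, j, _, _ = transforms.RandomCrop.get_params(padded_image, output_size=(224, 224))
# PIL's crop takes (left, upper, right, lower), hence j before i
cropped_image = padded_image.crop((j, i, j + 224, i + 224))
print(i, j)  # in the output below, i < 128, which means the crop extends into the padded area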

Output:
96 1185

You can convert back to the old coords by just subtracting 128. Of course, you’ll get a negative coordinate if you were in the padded area to begin with.
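The subtraction can be sketched in plain Python as below. Both function names are hypothetical, and the 1000 x 1500 image size in the usage lines is an assumed value chosen only for illustration (the thread does not state the real image size). The second helper additionally clips the crop to the part that overlaps the unpadded image, which may be handy when the corner comes out negative.

```python
def to_original_coords(i, j, padding):
    """Map a crop's top-left corner from padded-image coordinates back to
    original-image coordinates (uniform padding on all sides assumed)."""
    return i - padding, j - padding


def visible_region(i, j, crop_h, crop_w, img_h, img_w):
    """Return (top, left, bottom, right) of the part of the crop that lies
    inside the unpadded image, in original coordinates, or None if the
    crop falls entirely in the padding."""
    top, left = max(i, 0), max(j, 0)
    bottom, right = min(i + crop_h, img_h), min(j + crop_w, img_w)
    if top >= bottom or left >= right:
        return None
    return top, left, bottom, right


# The (i, j) printed above, with padding=128:
oi, oj = to_original_coords(96, 1185, padding=128)
print(oi, oj)  # -32 1057: the crop starts 32 rows above the original image

# Assumed image size of 1000 x 1500 (height x width), for illustration only.
print(visible_region(oi, oj, 224, 224, img_h=1000, img_w=1500))  # (0, 1057, 192, 1281)
```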

Would this work?

Hi,
Thanks, yes, this is also the solution I came up with.

In addition to @Andrei_Cristea’s answer, here is an implementation of a random crop that you could adapt for your purpose: