Crop_and_resize in PyTorch

Hello,

Is there anything like tensorflow’s crop_and_resize in torch? I want to use interpolation instead of roi_pooling.

1 Like

Maybe you can use a spatial transformer layer.
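For example, here is a minimal sketch of crop-and-resize via a spatial transformer, assuming a recent PyTorch, a single image, and a box given as plain floats normalized to [0, 1] (the helper name and box convention are mine, not from any library):

```python
import torch
import torch.nn.functional as F

def crop_and_resize_stn(feature, box, out_size):
    # feature: (1, C, H, W); box: (y1, x1, y2, x2) floats, normalized to [0, 1].
    y1, x1, y2, x2 = box
    # affine_grid/grid_sample work in [-1, 1] coordinates, so build the
    # affine theta that maps the output grid onto the box region.
    theta = torch.tensor([[
        [x2 - x1, 0.0, x1 + x2 - 1.0],
        [0.0, y2 - y1, y1 + y2 - 1.0],
    ]], dtype=feature.dtype)
    grid = F.affine_grid(theta, [1, feature.size(1), out_size, out_size],
                         align_corners=False)
    return F.grid_sample(feature, grid, align_corners=False)
```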

I ported the crop_and_resize from tensorflow: https://github.com/longcw/RoIAlign.pytorch.
Hope this is useful for you.
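If I remember the README correctly, usage looks roughly like this (from memory, so please check the repo for the exact signatures and box format):

```python
import torch
from roi_align import RoIAlign  # from longcw/RoIAlign.pytorch

# featuremap: (N, C, H, W); boxes in (x1, y1, x2, y2) feature coordinates
featuremap = torch.randn(1, 256, 32, 32)
boxes = torch.tensor([[4.0, 4.0, 20.0, 20.0]])
box_index = torch.tensor([0], dtype=torch.int)  # which image each box belongs to

roi_align = RoIAlign(crop_height=7, crop_width=7)
crops = roi_align(featuremap, boxes, box_index)  # (num_boxes, C, 7, 7)
```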

5 Likes

That RoIAlign worked well. :slight_smile:

Seriously???

This is just running C source underneath, and it was copied from multimodallearning's MRCNN repo. At least give the original author credit.

And do not claim it is a port to PyTorch when it is not.

Seriously??
It was ported from the tensorflow source code, and I mentioned this everywhere in the readme and here.

I don't know which multimodallearning MRCNN repo you are referring to.
And it worked well for PyTorch < 0.4. I recommend using facebookresearch/maskrcnn-benchmark, since the cffi API changed between PyTorch 0.4 and 1.0.
My port was finished in Dec 2017, and maskrcnn-benchmark was started in Oct 2018. You can post the repo if you still think I copied any code from there. Your accusation is a huge insult to a programmer.

And please, at least check the facts before posting a comment.

OK, I found the repo multimodallearning/pytorch-mask-rcnn, and they are actually using my code.
This is also mentioned in their readme:

We use functions from two more repositories that need to be built with the right --arch option for cuda support. The two functions are Non-Maximum Suppression from ruotianluo's pytorch-faster-rcnn repository and longcw's RoiAlign.

So this is a misunderstanding, and I am very glad that my code is useful to others.

2 Likes

My bad, I apologize. :sweat_smile:
I should have checked the dates.
Yup, the code definitely helps.
Is there an implementation done in a more "PyTorch" way, without the C source? Say, using functional.interpolate()?

Or could you explain the geometric transform behind how you calculate "y_in" and "x_in" in the per-box loop? (My reading of the mapping is sketched below.)

I want to use this for randomly oriented quadrilateral ROIs, defined as rotated bounding boxes with 8 coordinates as input, as opposed to an axis-aligned box defined by 4 coordinates.
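For reference, this is my reading of the per-box mapping, a paraphrase of the TensorFlow-style crop_and_resize assuming boxes are (y1, x1, y2, x2) normalized to [0, 1] and crop sizes > 1 (not the exact source):

```python
# For each output pixel (i, j) of a (crop_height, crop_width) crop,
# linearly interpolate between the box corners in source-pixel coordinates:
for i in range(crop_height):
    y_in = y1 * (img_h - 1) + i * (y2 - y1) * (img_h - 1) / (crop_height - 1)
    for j in range(crop_width):
        x_in = x1 * (img_w - 1) + j * (x2 - x1) * (img_w - 1) / (crop_width - 1)
        # bilinearly sample the feature map at (y_in, x_in);
        # points falling outside the image get the extrapolation value
```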

1 Like

Check this for ROI pooling done the PyTorch way: https://github.com/ruotianluo/pytorch-faster-rcnn/commit/fa88df8c66528afc33cf8f900649504ebbc0335e#diff-69a3170ac05571b54972ad6d8ab782ddL125

This repo used F.grid_sample at first and changed to my crop_and_resize in this commit. The newest version uses roi_align from facebookresearch/maskrcnn-benchmark.
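If you just want something in plain PyTorch, a minimal sketch with F.interpolate looks like this. Note it crops at integer pixel boundaries, so it loses the sub-pixel alignment that makes RoIAlign accurate; the function name and box convention here are mine:

```python
import torch
import torch.nn.functional as F

def crop_and_resize_interp(feature, boxes, out_size):
    # feature: (1, C, H, W); boxes: list of (y1, x1, y2, x2) in pixel coords.
    _, _, H, W = feature.shape
    crops = []
    for y1, x1, y2, x2 in boxes:
        # clamp to the image and crop at integer boundaries
        y1, x1 = max(int(y1), 0), max(int(x1), 0)
        y2, x2 = min(int(y2), H), min(int(x2), W)
        patch = feature[:, :, y1:y2, x1:x2]
        crops.append(F.interpolate(patch, size=out_size,
                                   mode='bilinear', align_corners=False))
    return torch.cat(crops, dim=0)  # (num_boxes, C, *out_size)
```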

Hi, thanks for your great work.
I've got one problem. I've seen a version of ROI Align where one of the parameters is spatial_scale, representing the scale for mapping feature coordinates to the original image. For example, if the original image is 224x224 and the feature map is 14x14, then the spatial_scale is 16.
In your version of ROI Align there is no such parameter, so I guess the ROIs fed to this function should be scaled beforehand, e.g., divided by 16 in the example above. Am I right?

Hi, I am trying to understand the meaning of the spatial_scale parameter, but the documentation is not clear to me.
Reading the source code, the reverse ratio makes more sense to me, e.g., 14/224 = 0.0625.
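That matches torchvision's convention as I understand it: spatial_scale multiplies the box coordinates, so it is the feature/image ratio. A quick check, assuming a recent torchvision with torchvision.ops.roi_align:

```python
import torch
from torchvision.ops import roi_align

feat = torch.randn(1, 256, 14, 14)  # 14x14 feature map from a 224x224 image
# boxes: (batch_index, x1, y1, x2, y2) in original-image pixel coordinates
boxes = torch.tensor([[0.0, 32.0, 32.0, 96.0, 96.0]])
out = roi_align(feat, boxes, output_size=(7, 7), spatial_scale=14 / 224)
print(out.shape)  # torch.Size([1, 256, 7, 7])
```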

Please refer to torchvision.transforms — Torchvision 0.11.0 documentation