Mask R-CNN native PyTorch Implementation


I am currently working with Mask R-CNN and have been reading a wonderful article about it.

In my project we already know which parts of the image we want to pass through the box and mask heads of the network. Using roi_heads I can get the predictions for the masks and bounding boxes, but I have problems processing them and converting them to the original image size.

According to the article above, the postprocessing of the mask logits looks like this:

My question is where I can find these steps in the native PyTorch implementation. I tried looking at torchvision/detection/roi_align.py and the transforms, but I still do not understand how it works.
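
To make the question concrete, here is a minimal sketch of the kind of postprocessing I mean. It is only an illustration under assumptions, not torchvision's actual code: the helper name `paste_mask_logits` is hypothetical, the mask head output is assumed to have shape (N, C, M, M), and the boxes are assumed to already be in original-image pixel coordinates as (x1, y1, x2, y2).

```python
import torch
import torch.nn.functional as F


def paste_mask_logits(mask_logits, labels, boxes, image_size, threshold=0.5):
    """Hypothetical helper (not part of torchvision).

    mask_logits: (N, C, M, M) raw output of the mask head
    labels:      (N,) int64 predicted class index per RoI
    boxes:       (N, 4) boxes as (x1, y1, x2, y2) in image pixels
    image_size:  (height, width) of the target image
    Returns an (N, height, width) boolean tensor of full-image masks.
    """
    height, width = image_size
    num_rois = mask_logits.shape[0]

    # Step 1: logits -> probabilities, keeping only the predicted class channel.
    probs = mask_logits.sigmoid()[torch.arange(num_rois), labels]  # (N, M, M)

    full_masks = torch.zeros((num_rois, height, width), dtype=torch.bool)
    for i in range(num_rois):
        x1, y1, x2, y2 = boxes[i].round().long().tolist()
        box_w, box_h = max(x2 - x1, 1), max(y2 - y1, 1)

        # Step 2: resize the M x M mask to the size of its box.
        resized = F.interpolate(
            probs[i][None, None], size=(box_h, box_w),
            mode="bilinear", align_corners=False,
        )[0, 0]

        # Step 3: keep only the part of the box that lies inside the image.
        ix1, iy1 = max(x1, 0), max(y1, 0)
        ix2, iy2 = min(x1 + box_w, width), min(y1 + box_h, height)
        if ix2 <= ix1 or iy2 <= iy1:
            continue  # box lies completely outside the image

        # Step 4: threshold and paste into the full-size canvas.
        crop = resized[iy1 - y1:iy2 - y1, ix1 - x1:ix2 - x1]
        full_masks[i, iy1:iy2, ix1:ix2] = crop > threshold
    return full_masks
```

This sketch leaves out refinements a real implementation may apply, such as padding the mask before resizing or rescaling the boxes when the network input was resized.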

I would appreciate any help.
Thank you!