Faster/Mask RCNN RPN custom AnchorGenerator

Every time I define a new AnchorGenerator, I get a CUDA OOM error. I suspect it has nothing to do with memory and that there’s a weight mismatch somewhere. Here’s the code:

mrcnn_args = {'num_classes':63}
icdar_anchor_generator = AnchorGenerator(
      sizes=tuple([(4, 8, 16, 32, 64, 128, 256, 512) for r in range(5)]), 
      aspect_ratios = tuple([(0.25, 0.5, 1, 1.5, 2) for rh in range(5)]))
mrcnn_args['rpn_anchor_generator'] = icdar_anchor_generator
maskrcnn_model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=False, **mrcnn_args)

I followed the suggestion here:

If I simply pass one flat tuple each for sizes and aspect ratios, I get a size mismatch:

icdar_anchor_generator = AnchorGenerator(
      sizes=(4, 8, 16, 32, 64), 
      aspect_ratios = (0.5, 1, 1.5))

RuntimeError: shape '[840000, -1]' is invalid for input of size 4476540

So what else should I define?

Did you resolve this?

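Here is the correct way to do it: give each of the five FPN levels its own sizes tuple and rebuild the RPN head to match the new anchor count.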

import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor
from torchvision.models.detection.rpn import AnchorGenerator, RPNHead

def get_instance_segmentation_model_anchors(num_classes):
    # build a Mask R-CNN model (pretrained=False, so the COCO detection weights are not loaded)
    model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=False)

    # create an anchor generator for the FPN, which by default has 5 outputs:
    # one sizes tuple and one aspect_ratios tuple per feature level
    anchor_generator = AnchorGenerator(
        sizes=((16,), (32,), (64,), (128,), (256,)),
        aspect_ratios=tuple([(0.25, 0.5, 1.0, 2.0) for _ in range(5)]))

    model.rpn.anchor_generator = anchor_generator

    # rebuild the RPN head so its outputs match the new anchors per location;
    # 256 because that's the number of features that FPN returns
    model.rpn.head = RPNHead(256, anchor_generator.num_anchors_per_location()[0])

    # get the number of input features for the classifier
    in_features = model.roi_heads.box_predictor.cls_score.in_features

    # replace the box predictor with one sized for num_classes
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

    # now get the number of input features for the mask classifier
    in_features_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
    hidden_layer = 256

    # and replace the mask predictor with a new one
    model.roi_heads.mask_predictor = MaskRCNNPredictor(in_features_mask,
                                                       hidden_layer,
                                                       num_classes)

    return model
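
To see why the RPN head has to be rebuilt, it may help to count the anchors per location yourself. The sketch below is plain Python and does not call torchvision; the helper name anchors_per_location is my own, and as far as I can tell it mirrors what AnchorGenerator.num_anchors_per_location() computes (number of sizes times number of aspect ratios, per feature level).

```python
def anchors_per_location(sizes, aspect_ratios):
    """Anchors per spatial location for each level: len(sizes) * len(ratios)."""
    return [len(s) * len(r) for s, r in zip(sizes, aspect_ratios)]

# Configuration from the question: 8 sizes x 5 ratios on each of 5 levels
question_sizes = tuple((4, 8, 16, 32, 64, 128, 256, 512) for _ in range(5))
question_ratios = tuple((0.25, 0.5, 1, 1.5, 2) for _ in range(5))

# Configuration from the answer: 1 size x 4 ratios on each of 5 levels
answer_sizes = ((16,), (32,), (64,), (128,), (256,))
answer_ratios = tuple((0.25, 0.5, 1.0, 2.0) for _ in range(5))

question_counts = anchors_per_location(question_sizes, question_ratios)
answer_counts = anchors_per_location(answer_sizes, answer_ratios)
print(question_counts)  # [40, 40, 40, 40, 40]
print(answer_counts)    # [4, 4, 4, 4, 4]
```

RPNHead takes a single anchor count for all levels, which is why the function above passes only the first entry of the list; every level therefore needs the same number of sizes times ratios.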

You can now build the model and check the anchor sizes and aspect ratios:

model = get_instance_segmentation_model_anchors(num_of_classes)
print('Anchor Size :',model.rpn.anchor_generator.sizes)
print('Anchor Aspect ratio :',model.rpn.anchor_generator.aspect_ratios[0])
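
As a rough sanity check on memory, you can also estimate how many anchors a configuration produces per image. This is only a back-of-the-envelope sketch: the function total_anchors is hypothetical, the 800-pixel square input is made up, and the strides (4, 8, 16, 32, 64) are what I assume the five ResNet-50 FPN outputs use.

```python
import math

def total_anchors(image_size, strides, anchors_per_loc):
    """Rough anchor count for a square image: sum over levels of H * W * A."""
    total = 0
    for stride in strides:
        side = math.ceil(image_size / stride)  # approximate feature-map side
        total += side * side * anchors_per_loc
    return total

fpn_strides = (4, 8, 16, 32, 64)  # assumed strides of the five FPN outputs
image_size = 800                  # hypothetical square input

print(total_anchors(image_size, fpn_strides, 40))  # 2131760 (8 sizes x 5 ratios)
print(total_anchors(image_size, fpn_strides, 4))   # 213176 (1 size x 4 ratios)
```

With 40 anchors per location the RPN has to score about ten times as many boxes per image as with 4, so an out-of-memory error with the original settings could be a genuine memory problem rather than a weight mismatch.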