ResNet101 as a backbone for FasterRCNN? (From classification to detection)

I am following the TorchVision Object Detection Finetuning Tutorial (PyTorch Tutorials 1.8.1+cu102 documentation), where “Section 2 - Modifying the model to add a different backbone” reads:
# load a pre-trained model for classification and return
# only the features
backbone = torchvision.models.mobilenet_v2(pretrained=True).features
# FasterRCNN needs to know the number of
# output channels in a backbone. For mobilenet_v2, it’s 1280
# so we need to add it here
backbone.out_channels = 1280

Is it possible to use a pretrained ResNet101 (without FPN) as a backbone without rewriting a lot of code?
Unlike mobilenet_v2, ResNet101 has no obvious features attribute that I could use.

Sorry if this is a silly question.

Hi,

You just need to cut ResNet101 off before the fully connected layer, so that it returns a feature map instead of class scores.

Thanks, omarfoq!
I suspected as much, but I don’t know how to do it. A specific tip would be most welcome :-).

Actually, I cannot even get the tutorial code to work. This is the code from the section “2 - Modifying the model to add a different backbone” (which I believe is not used in the rest of the tutorial):

import torch
import torchvision
from torchvision.models.detection import FasterRCNN
from torchvision.models.detection.rpn import AnchorGenerator

backbone = torchvision.models.mobilenet_v2(pretrained=True).features
backbone.out_channels = 1280
anchor_generator = AnchorGenerator(sizes=((32, 64, 128, 256, 512),),
                                   aspect_ratios=((0.5, 1.0, 2.0),))
roi_pooler = torchvision.ops.MultiScaleRoIAlign(featmap_names=[0],
                                                output_size=7,
                                                sampling_ratio=2)
model = FasterRCNN(backbone,
                   num_classes=2,
                   rpn_anchor_generator=anchor_generator,
                   box_roi_pool=roi_pooler)

I tried to test it like this:

model.eval()
x = [torch.rand(3, 300, 400), ]
predictions = model(x)

This produces the error below. What am I doing wrong? Could it be related to the comment in the tutorial saying that, in the general case, the backbone features are returned as a dictionary?

  File ".../tutorial_error.py", line 19, in <module>
    predictions = model(x)
  File "...\venv\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "...\venv\lib\site-packages\torchvision\models\detection\generalized_rcnn.py", line 100, in forward
    detections, detector_losses = self.roi_heads(features, proposals, images.image_sizes, targets)
  File "...\venv\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "...\venv\lib\site-packages\torchvision\models\detection\roi_heads.py", line 752, in forward
    box_features = self.box_roi_pool(features, proposals, image_shapes)
  File "...\venv\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "...\venv\lib\site-packages\torchvision\ops\poolers.py", line 207, in forward
    self.setup_scales(x_filtered, image_shapes)
  File "...\venv\lib\site-packages\torchvision\ops\poolers.py", line 176, in setup_scales
    lvl_min = -torch.log2(torch.tensor(scales[0], dtype=torch.float32)).item()
IndexError: list index out of range

You can do it as follows:

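from torchvision.models.detection import FasterRCNN
from torchvision.models.detection.backbone_utils import resnet_fpn_backbone

# ResNet101 with a feature pyramid network (FPN) on top
backbone = resnet_fpn_backbone(backbone_name="resnet101", pretrained=True)

# num_labels is the number of object classes in your dataset; +1 for background
num_classes = num_labels + 1

fasterrcnn = FasterRCNN(backbone=backbone, num_classes=num_classes)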