Feature extraction from a ResNet pretrained on COCO

I need a ResNet pretrained on COCO, so I opted to use:

model = models.detection.fasterrcnn_resnet50_fpn(pretrained=True)

However, all I need are the ResNet components for simple classification. How do I isolate that model without all of the extra FasterRCNN parts?

Would just using that model’s .backbone.body work?

Yeah, but not out of the box.
Note that the FasterRCNN ResNet backbone uses FrozenBatchNorm2d layers, while you would most likely want trainable batchnorm layers for classification.
Also, the last linear layer is missing, and the backbone returns feature maps rather than class logits.

However, there might be a hacky way to grab the backbone’s trained parameters.

First you would have to find all FrozenBatchNorm2d layers and replace them with equivalent nn.BatchNorm2d layers.
Then apply a small fix to the architecture naming (model.bn1.bn1 in particular), and finally create a reference resnet50 model and load the trained parameters from the FasterRCNN backbone into it.

Here is the code for this workflow:

import copy
import torch
import torch.nn as nn
import torchvision
from torchvision import models

# Create the detection model and grab its backbone
model = models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
resnet = model.backbone.body

# Collect all FrozenBN layers
bn_to_replace = []
for name, module in resnet.named_modules():
    if isinstance(module, torchvision.ops.misc.FrozenBatchNorm2d):
        print('adding ', name)
        bn_to_replace.append(name)

# Iterate all layers to change
for layer_name in bn_to_replace:
    # Check if the name is nested
    *parent, child = layer_name.split('.')
    # Nested
    if len(parent) > 0:
        # Get the parent modules
        m = resnet.__getattr__(parent[0])
        for p in parent[1:]:
            m = m.__getattr__(p)
        # Get the FrozenBN layer
        orig_layer = m.__getattr__(child)
    else:
        # Top-level layer (e.g. bn1); note that setattr below nests
        # the new layer inside the old one, which is fixed up afterwards
        m = resnet.__getattr__(child)
        orig_layer = copy.deepcopy(m)  # deepcopy, otherwise you'll get an infinite recursion
    # Add your layer here
    in_channels = orig_layer.weight.shape[0]
    bn = nn.BatchNorm2d(in_channels)
    with torch.no_grad():
        bn.weight = nn.Parameter(orig_layer.weight)
        bn.bias = nn.Parameter(orig_layer.bias)
        bn.running_mean = orig_layer.running_mean
        bn.running_var = orig_layer.running_var
    m.__setattr__(child, bn)

# Fix the nested bn1 module so the state_dict keys match
resnet.bn1 = resnet.bn1.bn1

# Create reference model and load state_dict
reference = models.resnet50()
reference.load_state_dict(resnet.state_dict(), strict=False)

Note that I haven’t properly tested this use case, so let me know if it works for you.


Thank you so much. About to test it now. Does this account for every ResNet parameter, or will it leave some layers untrained? Is the final fc layer the only one not accounted for?

It seems to work! Now I would like to remove the avgpool, so I defined an Identity module and replaced it with that. However, now I don’t know what input size to give the subsequent linear layer. Before, I could just use fc.in_features.

Is there a way to get the output shape of layer4[2].conv3?

Got it to work with num_ftrs = 2048 * 7 * 7 🙂 Thank you!

Awesome, good to hear you got it working!
Let me know if you see any issues with it. 🙂

Just checking, could any of this interfere with loading and saving state_dicts? I’ve been having a weird bug with that.

What kind of error are you seeing?
Note that I’ve used strict=False while loading the state_dict into the reference model, as the FasterRCNN ResNet backbone is missing the last linear layer.