Changing the backbone for FasterRCNN

kiostra · December 9, 2019, 11:10am

Hello,
I am new to PyTorch and want to try different backbones for FasterRCNN. I am following this tutorial: https://pytorch.org/tutorials/intermediate/torchvision_tutorial.html#defining-your-model
This is my model:

import torch.nn as nn
import torchvision
from torchvision.models.detection import FasterRCNN
from torchvision.models.detection.rpn import AnchorGenerator

def my_model(num_classes):
    backboneV3 = torchvision.models.inception_v3(pretrained=False)
    backboneV3_nofc = nn.Sequential(*list(backboneV3.children())[:-1])
    backboneV3_nofc.out_channels = 192
    anchor_generator = AnchorGenerator(sizes=((32, 64, 128, 256, 512),),
                                       aspect_ratios=((0.5, 1.0, 2.0),))

    roi_pooler = torchvision.ops.MultiScaleRoIAlign(featmap_names=[0],
                                                    output_size=7,
                                                    sampling_ratio=2)
    modelV3 = FasterRCNN(backboneV3_nofc,
                         num_classes=num_classes,
                         rpn_anchor_generator=anchor_generator,
                         box_roi_pool=roi_pooler)

    return modelV3

I feed the data during training exactly as in the tutorial, but I always get this error:
RuntimeError: Expected 4-dimensional input for 4-dimensional weight 192 768, but got 2-dimensional input of size [2, 1000] instead
What should I change in the model?

mailcorahul · December 9, 2019, 5:49pm

You should ideally pass a tensor of shape B x C x H x W, where B = batch size, C = channels, H = height, and W = width.
Assuming the exception above is occurring at the input layer, it looks like you are passing a tensor of shape 2 x 1000. Can you verify the shape of your input tensor?
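
One way to check where the shape goes wrong is to record the input shape each submodule receives, using forward pre-hooks. The sketch below uses a hypothetical toy pipeline (not the Inception backbone from the question) in which a middle stage emits 2-D logits that the next convolution cannot consume; `net`, `seen_shapes`, and `record_input_shape` are illustrative names only.

```python
import torch
import torch.nn as nn

# Hypothetical toy pipeline: the middle stage returns 2-D logits,
# which the final convolution cannot accept.
net = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10)),
    nn.Conv2d(8, 4, kernel_size=1),
)

seen_shapes = []


def record_input_shape(module, inputs):
    # Pre-hooks run before the module's forward, so the offending input
    # is recorded even though the forward pass then raises.
    seen_shapes.append((type(module).__name__, tuple(inputs[0].shape)))


handles = [child.register_forward_pre_hook(record_input_shape)
           for child in net.children()]

try:
    net(torch.rand(2, 3, 16, 16))
except RuntimeError as err:
    print("forward failed:", err)
finally:
    for handle in handles:
        handle.remove()  # always clean up the hooks

for name, shape in seen_shapes:
    print(name, shape)
```

The last recorded entry is the module that received the unexpected shape, which helps tell a bad input tensor apart from a problem inside the model itself.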

kiostra · January 9, 2020, 11:00am

So if I do:

backboneV3 = torchvision.models.inception_v3(pretrained=False)
backboneV3.eval()
x = torch.rand(1, 3, 299, 299)
backboneV3(x)

this works fine. But if I try to slice the model so that I can plug it into FasterRCNN, it no longer works:

backboneV3_nofc = nn.Sequential(*list(backboneV3.children())[:-1])
backboneV3_nofc.eval()
backboneV3_nofc(x)

I feel a bit lost. How should I slice the model?

ptrblck · January 10, 2020, 2:25am

In the tutorial, the backbone is created using model.features.
This is unfortunately not possible with the current Inception implementation, since its forward method performs some functional calls (see the forward method in the torchvision Inception source).
These functional calls also make it impossible to slice the model and wrap the pieces in nn.Sequential.
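
To illustrate why the rebuild fails, here is a minimal sketch with a hypothetical toy network (`TinyNet`, not Inception) whose forward uses functional calls. `children()` only yields registered submodules, so an nn.Sequential built from them silently drops the functional steps. In the Inception case, a plausible reading of the reported error is that the auxiliary classifier, being a registered child, also runs inline and hands its [2, 1000] logits to the next block, though that is an inference and not something confirmed in this thread.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyNet(nn.Module):
    """Hypothetical toy model whose forward relies on functional calls."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.fc = nn.Linear(8, 4)

    def forward(self, x):
        x = self.conv(x)
        x = F.adaptive_avg_pool2d(x, (1, 1))  # functional call, not a child module
        x = torch.flatten(x, 1)               # functional call, not a child module
        return self.fc(x)


model = TinyNet().eval()
x = torch.rand(2, 3, 16, 16)

full_out = model(x)  # the original forward works

# Rebuilding from children() keeps only conv and fc; pooling and flatten are lost,
# so fc receives a 4-D feature map it cannot handle.
rebuilt = nn.Sequential(*model.children()).eval()
try:
    rebuilt(x)
    rebuilt_failed = False
except RuntimeError as err:
    rebuilt_failed = True
    print("Sequential rebuild failed:", err)
```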

A possible workaround would be to return the features, by replacing the last linear layer with nn.Identity:

model = models.inception_v3(aux_logits=False)
model.fc = nn.Identity()

This does not slice the model in any way: it keeps the forward definition and just returns the penultimate activations.

Let me know if this works.