Using Certain Layers of Mask RCNN/Faster RCNN

Hi!

I am trying to use the pretrained Mask RCNN model directly from PyTorch (torchvision). However, I would like to separate the backbone part from the RPN and the head part, i.e., I want to create two models from the initial model. I have seen many people ask about this, but I could not find a solution.

If I split the model using modules = list(model.children()) and try to build an nn.Sequential() from that list, I get this error: TypeError: conv2d(): argument 'input' (position 1) must be Tensor, not tuple.

Other people have run into this issue too, but I still couldn’t solve it. Some examples are:

https://stackoverflow.com/questions/59816287/how-to-remove-certain-layers-from-fastern-rcnn-in-pytorch?answertab=oldest#tab-top

https://stackoverflow.com/questions/56455302/use-only-certain-layers-of-pretrained-torchvision-network

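The problem is caused by GeneralizedRCNNTransform() returning a torchvision.models.detection.image_list.ImageList object (wrapped in a tuple together with the targets) rather than a plain Tensor, which nn.Sequential then passes straight into the backbone's first convolution.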

Would be grateful for any kind of hints/help.

Thanks in advance.

nn.Sequential is a container used for simple feedforward models.
In the case of more complicated models, you have to be careful not to lose any functional API calls made in the original forward method, since nn.Sequential only chains the registered child modules.
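As a small illustration of that point, here is a toy model (the class name Tiny is made up for this example) whose forward uses torch.flatten. Rebuilding it with nn.Sequential silently drops that call and the result no longer runs.

```python
import torch
import torch.nn as nn


class Tiny(nn.Module):
    """Toy model: the flatten step exists only in forward, not as a child module."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 4, kernel_size=3, padding=1)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        x = self.pool(self.conv(x))
        x = torch.flatten(x, 1)  # functional call, invisible to model.children()
        return self.fc(x)


model = Tiny()
x = torch.rand(2, 3, 8, 8)
print(model(x).shape)  # torch.Size([2, 2])

# Chaining the children loses the flatten, so fc receives a [2, 4, 1, 1] tensor.
seq = nn.Sequential(*model.children())
try:
    seq(x)
    sequential_failed = False
except RuntimeError as err:
    sequential_failed = True
    print("nn.Sequential version failed:", err)
```

The detection models have the same kind of problem, only with more logic in forward (unpacking the ImageList, passing image sizes to the heads, postprocessing).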

That being said, I would recommend either deriving your custom class from the base class and manipulating the modules as you wish, or alternatively creating a custom model class and passing in the desired submodules from the pretrained model.