Using a custom finetuned model with detectron2

Hi all, not sure if this is the correct place to ask this but hopefully someone can shine a light on this.

I started off by creating a classifier with PyTorch based on resnet50.
Following this, I created a separate object detection model with detectron2, using "COCO-Detection/faster_rcnn_R_101_FPN_3x.yaml" as my pretrained model.

I want to know if there is a way to use the previous classification model I have already created as the base for my object detection model. Am I just thinking about this wrong? Is there a better way to do what I am trying to do?

Further context: I have classified images for multiple classes, but only one of the classes has segmentation data available (masks and bounding boxes). Eventually, I would like all classes to be detected by a camera in real time with bounding boxes, but the data for that is not there yet. Please let me know if any other approaches should be considered/investigated.


Yes, you should be able to use your custom model as the backbone, at least I don’t think there should be any fundamental limitations to it as long as all shapes match the overall workflow.
I haven’t checked the code deeply, but e.g. this might be a good place to use your custom model.


Hi @ptrblck, thanks for your reply. At the moment, this is what the prototyped train code looks like; it comes from one of the examples. I was wondering if there is a more direct way to swap out the model, since it is passed as an argument into merge_from_file. Are you aware of any resources on how to make existing .pth model weights compatible with the format detectron2 expects as input?

Additionally, I am thinking whether the existing instance segmentation models available here may provide some valuable foundations alongside the classifier I have already built. Do you have any ideas for any sort of combinations I could perform? I’m very new to the field, so thank you for your time.

import os

from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer

pretrained_model = "COCO-Detection/faster_rcnn_R_101_FPN_3x.yaml"
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(pretrained_model))  # load the architecture config before overriding fields
cfg.DATASETS.TRAIN = ("my_train_folder",)
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(pretrained_model)  # Let training initialize from model zoo
cfg.SOLVER.BASE_LR = 0.00025  # pick a good LR
cfg.SOLVER.MAX_ITER = 500    # a few hundred iterations are enough for this toy dataset; you will need to train longer for a practical dataset
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 64   # faster, and good enough for this toy dataset (default: 512)
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1  # only has one class (balloon). (see
# NOTE: this config means the number of classes, but a few popular unofficial tutorials incorrectly use num_classes+1 here.

os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
trainer = DefaultTrainer(cfg)

No, I’m unfortunately not familiar with any resources that show how to replace the backbone.
I think you could also create an issue in the Detectron2 repository to get better guidance from the authors of the model.

Hi umerhasan17, I am looking to do the same thing that you were asking about. Please let me know if you managed to do it.


tl;dr: It is possible to make the bytes load and flow, but watch the network architecture.

I am trying something similar: fine-tuning a ResNet outside Detectron (using SSL) and then plugging it back in. I think I solved the “bytes are flowing” problem, but there is no learning.
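For the "bytes are flowing" part, most of the work is renaming state_dict keys so that detectron2's ResNet recognises them. Below is a minimal sketch of that renaming, modelled on the heuristic used by the tools/convert-torchvision-to-d2.py script in the detectron2 repository. It works on plain dict keys, so it runs without torch installed. The function names and the decision to drop the `fc.*` classifier head are assumptions made for illustration, not part of detectron2.

```python
def torchvision_key_to_d2(key: str) -> str:
    """Rename one torchvision ResNet state_dict key to detectron2-style naming."""
    # Everything outside layer1..layer4 (conv1, bn1) belongs to the stem.
    if "layer" not in key:
        key = "stem." + key
    # torchvision layer1..layer4 correspond to detectron2 res2..res5.
    for t in (1, 2, 3, 4):
        key = key.replace(f"layer{t}", f"res{t + 1}")
    # detectron2 stores each norm layer as a child of its conv: bnN -> convN.norm.
    for t in (1, 2, 3):
        key = key.replace(f"bn{t}", f"conv{t}.norm")
    # The downsample branch becomes the shortcut conv and its norm.
    key = key.replace("downsample.0", "shortcut")
    key = key.replace("downsample.1", "shortcut.norm")
    return key


def convert_state_dict(state_dict: dict) -> dict:
    """Return a renamed copy of a torchvision ResNet state_dict.

    Assumption: the classification head (fc.*) is dropped here because a
    detection backbone has no use for it.
    """
    return {
        torchvision_key_to_d2(key): value
        for key, value in state_dict.items()
        if not key.startswith("fc.")
    }


if __name__ == "__main__":
    # Placeholder values stand in for tensors; only the keys matter here.
    demo = {
        "conv1.weight": 0,
        "bn1.running_mean": 1,
        "layer1.0.downsample.0.weight": 2,
        "layer4.2.bn3.bias": 3,
        "fc.weight": 4,
    }
    for old, new in zip(demo, convert_state_dict(demo)):
        print(old, "->", new)
```

With a real checkpoint, the input would be the dict returned by `torch.load(...)` (or its nested state dict, depending on how the classifier was saved). The detectron2 script then pickles the result as `{"model": ..., "matching_heuristics": True}` with the tensors converted to NumPy arrays, and that file is what `cfg.MODEL.WEIGHTS` points to.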

I also managed to load the pretrained backbone weights from torchvision. There is some learning, but performance drops significantly compared to the pure D2 pretrained model.
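One thing that may be worth checking with torchvision-style weights is whether the config matches where the weights came from. The detectron2 model zoo ResNets come from the original MSRA models, which expect BGR input and place the stride in the 1x1 convolution. torchvision ResNets expect RGB input normalised with the ImageNet mean and std, and place the stride in the 3x3 convolution. The sketch below collects the matching overrides as a flat key/value list suitable for `cfg.merge_from_list`. The helper name and the example file name are hypothetical, and a depth of 50 is assumed because the classifier in this thread is a resnet50.

```python
def torchvision_resnet_overrides(weights_path: str, depth: int = 50) -> list:
    """Config overrides for a detectron2 ResNet initialised from torchvision-style weights.

    Returned as alternating keys and values, the layout cfg.merge_from_list expects.
    """
    return [
        "MODEL.WEIGHTS", weights_path,
        "MODEL.RESNETS.DEPTH", depth,
        # torchvision puts the stride on the 3x3 conv of each bottleneck block.
        "MODEL.RESNETS.STRIDE_IN_1X1", False,
        # ImageNet mean and std scaled to the 0-255 range, in RGB order.
        "MODEL.PIXEL_MEAN", [123.675, 116.280, 103.530],
        "MODEL.PIXEL_STD", [58.395, 57.120, 57.375],
        "INPUT.FORMAT", "RGB",
    ]


if __name__ == "__main__":
    # "my_resnet50_d2.pkl" is a hypothetical file name for the converted weights.
    overrides = torchvision_resnet_overrides("my_resnet50_d2.pkl")
    for key, value in zip(overrides[0::2], overrides[1::2]):
        print(f"{key} = {value}")
    # With detectron2 installed, the overrides would be applied as:
    #   cfg.merge_from_list(overrides)
```

Even with these settings, a backbone-only checkpoint leaves the RPN and ROI heads randomly initialised, so some gap to a fully COCO-pretrained detector is expected after a short fine-tuning run.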

Code is available here: demo_vissl_detectron2/detectron_vissl_2.ipynb (cristi-zz/demo_vissl_detectron2 on GitHub, master branch). Skip the VISSL part and jump to "Finetune ResNet torchvision model and evaluate on Balloons".

I am tinkering A LOT with torchvision and Detectron now, and I am reaching the conclusion that Detectron’s ResNet can’t be easily trained in an SSL context (as opposed to torchvision’s ResNet from MaskRCNN).
ATM I have abandoned the Detectron2 path and am trying to do segmentation using vanilla torchvision. World of pain there, too.

Hope it helps!