Hi!
tl;dr: It is possible to make the bytes load and flow, but watch the network architecture.
I am trying something similar: fine-tuning a ResNet outside Detectron (using SSL) and then plugging it back in. I think I solved the “bytes are flowing” problem, but the model does not learn.
I also managed to load the pretrained backbone weights from torchvision. With those there is some learning, but the performance drops significantly compared to the pure D2 pretrained model.
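In case it helps anyone hitting the same wall, here is a minimal sketch of a check that shows whether torchvision weights would actually land in the Detectron2 backbone. The renaming scheme below (`stem`, `res2` to `res5`, `.norm`, `shortcut`) is my assumption about how Detectron2 names its ResNet parameters, and both helper names are made up for illustration, so verify against the `state_dict()` keys of your own models.

```python
def torchvision_to_d2_key(key):
    """Rename one torchvision ResNet parameter name to the ASSUMED Detectron2 style."""
    # Assumption: everything outside layer1..layer4 lives under "stem" in Detectron2.
    if "layer" not in key:
        key = "stem." + key
    # Assumption: torchvision layer1..layer4 correspond to res2..res5.
    for t in (1, 2, 3, 4):
        key = key.replace(f"layer{t}", f"res{t + 1}")
    # Assumption: batch norm bnN is stored as the ".norm" child of convN.
    for t in (1, 2, 3):
        key = key.replace(f"bn{t}", f"conv{t}.norm")
    # Assumption: the downsample branch is called "shortcut".
    key = key.replace("downsample.0", "shortcut")
    key = key.replace("downsample.1", "shortcut.norm")
    return key


def compare_keys(torchvision_keys, d2_keys):
    """Return (missing, unexpected) key lists after renaming the torchvision keys.

    missing:    keys the Detectron2 backbone expects but the renamed set lacks
    unexpected: renamed keys that the Detectron2 backbone does not have
    """
    renamed = {torchvision_to_d2_key(k) for k in torchvision_keys}
    expected = set(d2_keys)
    return sorted(expected - renamed), sorted(renamed - expected)
```

In practice you would pass in `model.state_dict().keys()` from both models. If `missing` is not empty, part of the backbone stays at its initial values even though loading "works", which is one way to end up with bytes flowing but poor results.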
Code is available here: demo_vissl_detectron2/detectron_vissl_2.ipynb in the cristi-zz/demo_vissl_detectron2 repository on GitHub (master branch). Skip the VISSL part and jump to "Finetune ResNet torchvision model and evaluate on Balloons".
I am tinkering A LOT now with torchvision and Detectron, and my conclusion so far is that Detectron’s ResNet can’t easily be trained in an SSL context (as opposed to torchvision’s ResNet from MaskRCNN).
ATM I have abandoned the Detectron2 path and am trying to do segmentation using vanilla torchvision. World of pain there, too.
Hope it helps!