I’d like to train a model capable of doing both instance segmentation and keypoint detection at the same time. Is there anything available off the shelf right now?
If not, how could I implement it myself? Torchvision has Mask-RCNN for instance segmentation and Keypoint-RCNN for keypoint detection. How could I, for example, extract the heads of Keypoint-RCNN, add them as a parallel branch to Mask-RCNN, and then train the combined model for a new use case?