ResNet34 Keypoints Detection

Hi there,

I am trying to use the regression head shown below (head_reg) with a ResNet34 model. The example it comes from used images of 384 by 288 pixels, but I do not understand where the values (64 * 12 * 9) and (6144) come from.

The example used 12 keypoints. How should I change the model if I only want to detect two keypoints and I resize my images to 224*224 pixels?

Link to the example: https://towardsdatascience.com/hand-keypoints-detection-ec2dca27973e

head_reg = nn.Sequential(
    nn.Conv2d(512, 64, kernel_size=(1, 1)),
    nn.BatchNorm2d(64),
    nn.ReLU(),
    Flatten(),
    nn.Linear(64 * 12 * 9, 6144),
    nn.ReLU(),
    nn.Linear(6144, 24),
    Reshape(-1, 12, 2),
    nn.Tanh())
learn = create_cnn(data, arch, metrics=[my_acc,my_accHD], loss_func=F.l1_loss, custom_head=head_reg)
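For anyone tracing the numbers in the head above: a ResNet34 body downsamples its input by a total factor of 32, so a 384 by 288 image leaves a 12 by 9 feature map, and the 1x1 convolution reduces it to 64 channels. Flatten() therefore produces 64 * 12 * 9 values. The 24 at the end is 12 keypoints times 2 coordinates. The 6144 does not follow from the image size and looks like a hidden-layer width chosen by the article's author (that part is an assumption). A minimal sketch of the arithmetic, using a hypothetical helper name:

```python
def head_input_size(height, width, channels=64, stride=32):
    """Feature count after Flatten() for a backbone with the given total stride.

    Assumes height and width are exact multiples of the stride.
    """
    feat_h, feat_w = height // stride, width // stride
    return channels * feat_h * feat_w, (feat_h, feat_w)


print(head_input_size(384, 288))  # (6912, (12, 9))
print(head_input_size(224, 224))  # (3136, (7, 7))
```

So for 224*224 inputs the first Linear layer would take 64 * 7 * 7 = 3136 inputs instead of 64 * 12 * 9.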

Maybe you can just output two features at the end of the model.
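To make that concrete, here is a rough sketch of what the head could look like for two keypoints on 224*224 images. Note that two keypoints with (x, y) each means 4 output values, reshaped to (2, 2). This sketch swaps the Flatten and Reshape helpers from the original post for PyTorch's built-in nn.Flatten and nn.Unflatten, and the hidden width of 1024 is an arbitrary choice of mine, not something taken from the article. The random tensor only stands in for the ResNet34 body output.

```python
import torch
import torch.nn as nn

NUM_KEYPOINTS = 2
FEAT_H = FEAT_W = 224 // 32  # ResNet34 total stride is 32, so 7 x 7

head_reg = nn.Sequential(
    nn.Conv2d(512, 64, kernel_size=1),       # 512 channels come out of the ResNet34 body
    nn.BatchNorm2d(64),
    nn.ReLU(),
    nn.Flatten(),                            # -> (N, 64 * 7 * 7)
    nn.Linear(64 * FEAT_H * FEAT_W, 1024),   # hidden width is an arbitrary assumption
    nn.ReLU(),
    nn.Linear(1024, NUM_KEYPOINTS * 2),      # one (x, y) pair per keypoint
    nn.Unflatten(1, (NUM_KEYPOINTS, 2)),     # -> (N, 2, 2)
    nn.Tanh(),                               # outputs in [-1, 1], so targets must be scaled to match
)

# Stand-in for the backbone output on a batch of eight 224x224 images.
features = torch.randn(8, 512, FEAT_H, FEAT_W)
out = head_reg(features)
print(out.shape)
```

With fastai you would then pass this as custom_head, the same way as in the original post.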

I'm trying to do keypoint detection based on the tutorial at https://pytorch.org/tutorials/intermediate/torchvision_tutorial.html.
I changed some of the code to support keypoints, as follows:


def get_model_instance_segmentation(num_classes=2, num_keypoints=4, pretrained=True, mask=False):
    import torchvision
    from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
    from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor
    from torchvision.models.detection.keypoint_rcnn import KeypointRCNNPredictor

    # Load a detection model pre-trained on COCO. The COCO checkpoint only fits
    # the default head sizes, so build the model with its defaults here and
    # replace the heads with the requested sizes below.
    if mask:
        model_template = torchvision.models.detection.maskrcnn_resnet50_fpn
    else:
        model_template = torchvision.models.detection.keypointrcnn_resnet50_fpn
    model = model_template(pretrained=pretrained)

    # get number of input features for the classifier
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    print("In Features: ", in_features)
    # replace the pre-trained head with a new one
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

    if mask:
        # now get the number of input features for the mask classifier
        in_features_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
        hidden_layer = 256
        # and replace the mask predictor with a new one
        model.roi_heads.mask_predictor = MaskRCNNPredictor(in_features_mask,
                                                        hidden_layer,
                                                        num_classes)
    else:
        in_features_keypoint = model.roi_heads.keypoint_predictor.kps_score_lowres.in_channels
        # print("InFeatures: ", in_features_keypoint, model.roi_heads.keypoint_predictor.out_channels)
        model.roi_heads.keypoint_predictor = KeypointRCNNPredictor(in_features_keypoint, num_keypoints)

    return model

You may want to take a look at the source code of torchvision.models.detection.keypoint_rcnn.

You can check out my code for fine tuning the keypointrcnn_resnet50_fpn model for detecting 2 keypoints.

code for training
It works!

I am having issues with the bounding boxes. Did you pass them as x0, y0, width, height or as xmin, ymin, xmax, ymax? I keep getting this error: "All bounding boxes should have positive height and width", even though all my bounding boxes are correct.
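For what it's worth, the torchvision detection models expect boxes as [x1, y1, x2, y2] (that is, xmin, ymin, xmax, ymax), and this error is raised when some box has x2 <= x1 or y2 <= y1. One common cause, which is only a guess for your case, is passing COCO-style [x, y, width, height] boxes directly. A small sketch with hypothetical helper names and made-up numbers:

```python
def xywh_to_xyxy(box):
    """Convert [x, y, width, height] to [x1, y1, x2, y2]."""
    x, y, w, h = box
    return [x, y, x + w, y + h]


def is_valid_xyxy(box):
    """True if the box has strictly positive width and height."""
    x1, y1, x2, y2 = box
    return x2 > x1 and y2 > y1


coco_box = [120.0, 80.0, 60.0, 40.0]   # x, y, width, height
print(is_valid_xyxy(coco_box))          # False: read as xyxy, x2 (60) < x1 (120)

converted = xywh_to_xyxy(coco_box)
print(converted)                        # [120.0, 80.0, 180.0, 120.0]
print(is_valid_xyxy(converted))         # True
```

If your boxes are already tensors, torchvision.ops.box_convert(boxes, in_fmt="xywh", out_fmt="xyxy") does the same conversion. It is also worth running a check like is_valid_xyxy over the whole dataset, since a single degenerate box (for example after cropping or resizing) is enough to trigger the error.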