Keypoints detection program

Hello,
I am looking for two keypoint detection programs:
1: one with one point per class
2: the other with any number of points per class
Both programs must accept large images (1500 x 1000 pixels) without resizing them before they enter the neural network.
Thank you
Best regards

No one has an answer?

Hi,

You can use the Keypoint RCNN from torchvision.

You can specify the number of keypoints that you want. If the number of keypoints varies, choose the maximum possible number and, when preparing the data, append zeros whenever an instance has fewer keypoints than the maximum.
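A minimal sketch of that zero padding, assuming each keypoint is stored as [x, y, visibility] (the layout torchvision's Keypoint RCNN targets use) and that visibility 0 marks a padded slot. The function name `pad_keypoints` is made up for illustration.

```python
def pad_keypoints(points, max_keypoints):
    """Pad a list of (x, y) points to max_keypoints rows of [x, y, visibility]."""
    if len(points) > max_keypoints:
        raise ValueError("more keypoints than max_keypoints")
    # Real keypoints are marked visible (1.0).
    padded = [[float(x), float(y), 1.0] for x, y in points]
    # Missing slots are filled with zeros, including a visibility of 0.
    padded += [[0.0, 0.0, 0.0] for _ in range(max_keypoints - len(points))]
    return padded


# A leaf with 2 annotated teeth, padded to a maximum of 4.
print(pad_keypoints([(10, 20), (30, 40)], 4))
```

Every instance then has the same number of rows, so the keypoints of all leaves in an image can be stacked into one tensor of shape [N, K, 3].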

You can also refer to this tutorial to learn more about the Mask RCNN architecture and how to prepare a dataset for it, including the bounding boxes, which are necessary to train this model.
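If your annotations contain only points or a polygon and no boxes, one option is to derive the box from them. This is a sketch under that assumption; `box_from_points` and its `margin` parameter are hypothetical names, and the [x1, y1, x2, y2] order is the one torchvision detection models expect.

```python
def box_from_points(points, width, height, margin=0.0):
    """Return [x1, y1, x2, y2] enclosing the (x, y) points, clipped to the image."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x1 = max(0.0, min(xs) - margin)
    y1 = max(0.0, min(ys) - margin)
    x2 = min(float(width), max(xs) + margin)
    y2 = min(float(height), max(ys) + margin)
    return [x1, y1, x2, y2]


# Outline points of one leaf in a 1500 x 1000 image, with a 10 pixel margin.
print(box_from_points([(100, 200), (400, 250), (300, 600)], 1500, 1000, margin=10))
```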

Thank you very much! I will try it.

I saw your tutorial, but it only allows one point per class and per polygon, so it answers half of the question. For example, I want to detect the teeth of a leaf, so there is a variable number of points of the same class per polygon (= leaf).
Thank you

Don’t you know of a program that does this?

Hi,

Sorry for not replying earlier; I didn’t receive a notification.

one point per class and per polygon

What do you mean by one point per class? You can specify whatever number of keypoints you want to estimate within each bounding box (in your case, a leaf) when you call the RCNN constructor. Of course, this only holds as long as your training data is of the same nature as the test samples.
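A rough sketch of such a constructor call, assuming torchvision 0.13 or newer (older releases use `pretrained=` instead of `weights=`). The helper names `detector_kwargs` and `build_model` are made up here. Setting `min_size` and `max_size` to the image dimensions is my assumption for keeping a 1500 x 1000 image at its original scale inside the model's internal transform; please check it against your torchvision version.

```python
def detector_kwargs(num_keypoints, image_height, image_width):
    """Keyword arguments for a one-class (leaf) keypoint detector."""
    return {
        "weights": None,           # no pretrained COCO keypoint head
        "weights_backbone": None,  # assumption: no download; a pretrained backbone may help
        "num_classes": 2,          # background + leaf
        "num_keypoints": num_keypoints,
        # The internal transform scales the shorter side to min_size and caps
        # the longer side at max_size, so matching both to the image size
        # should leave the scale unchanged.
        "min_size": min(image_height, image_width),
        "max_size": max(image_height, image_width),
    }


def build_model(**kwargs):
    # Imported lazily so the settings above can be inspected without torchvision.
    from torchvision.models.detection import keypointrcnn_resnet50_fpn
    return keypointrcnn_resnet50_fpn(**kwargs)


settings = detector_kwargs(num_keypoints=150, image_height=1000, image_width=1500)
print(settings)
# model = build_model(**settings)  # requires torch and torchvision
```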

there are a variable number of points of the same class per polygon(=leaf)

As I suggested, you can assume that every leaf has the maximum possible number of teeth. Whenever that is not the case, i.e. the leaf has fewer teeth than the maximum, you can just assume that the missing ones are there but not visible and duplicate the keypoints until you reach the maximum. At least, this approach works fine for me.
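One way to read "duplicate the keypoints" is sketched below: the extra slots repeat the existing points in order and are flagged as not visible. Marking the copies with visibility 0 is my interpretation, and `fill_by_duplication` is a hypothetical name. As far as I know, torchvision's keypoint loss skips keypoints whose visibility is 0, so the copies would only fill up the tensor.

```python
from itertools import cycle, islice


def fill_by_duplication(points, max_keypoints):
    """Fill up to max_keypoints rows of [x, y, visibility] by repeating points."""
    if not points:
        raise ValueError("at least one keypoint is needed to duplicate")
    if len(points) > max_keypoints:
        raise ValueError("more keypoints than max_keypoints")
    real = [[float(x), float(y), 1.0] for x, y in points]
    extra = max_keypoints - len(points)
    # Cycle through the real points to fill the remaining slots, visibility 0.
    copies = [[float(x), float(y), 0.0] for x, y in islice(cycle(points), extra)]
    return real + copies


# A leaf with 2 teeth, filled up to a maximum of 5.
print(fill_by_duplication([(1, 2), (3, 4)], 5))
```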

No, in the COCO keypoints format there is only one keypoint per keypoint name (= class) and per polygon. If I imagine keypoint names “teeth1”, “teeth2”, etc., it wouldn’t work, as teeth1 and teeth2 on two different leaves have no correlation.

OK, it’s a bit difficult for me to understand your point. Could you please describe the dataset that you have for training?
As I understand it, you have a set of RGB images that contain leaves, and each leaf has a number of keypoints (teeth) that varies from leaf to leaf, e.g. 20, 100, 150, 500 and so on. Each keypoint has x, y values that correspond to its pixel location in the image.
If you don’t have such a dataset, then you can’t train the Keypoint RCNN model to do this task, afaik.

Yes, I have such a dataset, but what do you think I should put in keypoints_names?

What I want to say is that I have only one keypoint_name, teeth (one class), and several keypoints, and if I put teeth1, teeth2 and so on in the keypoint names, it can’t work.

Why do you need names? :sweat_smile:
Keypoint RCNN doesn’t care about the names of keypoints as long as you format the data correctly. Please refer to this tutorial for more information: https://debuggercafe.com/human-pose-detection-using-pytorch-keypoint-rcnn/


It does! For example:
This model has been pre-trained on the COCO Keypoint dataset. It outputs the keypoints for 17 human parts and body joints. They are: ‘nose’, ‘left_eye’, ‘right_eye’, ‘left_ear’, ‘right_ear’, ‘left_shoulder’, ‘right_shoulder’, ‘left_elbow’, ‘right_elbow’, ‘left_wrist’, ‘right_wrist’, ‘left_hip’, ‘right_hip’, ‘left_knee’, ‘right_knee’, ‘left_ankle’, ‘right_ankle’.

For example, left_eye is a keypoint_name.

This is just to identify what each keypoint represents for visualization purposes. It also helps to create edges between them. I have trained this model to identify 778 keypoints for the hand without knowing which is which. No need for names!

But in this case, how does it recognize which point has which name if it does not differentiate them?

I will test it. How many iterations are good for training, please?

But in this case, how does it recognize which point has which name if it does not differentiate them?

Depending on the order of points while training
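Since the model tells keypoints apart only by their position in the list, it probably helps to put the points of every leaf in the same canonical order before training. The sketch below sorts them by angle around their centroid; this particular ordering rule is just an assumption for illustration (following the leaf contour from the tip would be another option), and `order_by_angle` is a made-up name.

```python
import math


def order_by_angle(points):
    """Sort (x, y) points by their angle around the centroid of the points."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return sorted(points, key=lambda p: math.atan2(p[1] - cy, p[0] - cx))


# The same four points in two different annotation orders give the same result.
a = order_by_angle([(0, 1), (1, 0), (-1, 0), (0, -1)])
b = order_by_angle([(-1, 0), (0, -1), (0, 1), (1, 0)])
print(a == b)
```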

How many iterations are good for training, please?

In my experience, very few epochs should be sufficient. 1 epoch can be enough; more than 5 could result in overfitting.