Keypoint localization

I am trying to build a keypoint detection program (with 12 keypoints). I would really appreciate it if someone could advise which model would be best suited for this. I tried custom CNN models, but their accuracy is a problem. Is there a state-of-the-art model that I can use? If there is, could someone please also explain how to use it?

Thanks