I am looking for a simple CNN-based baseline for detecting landmarks on an animal, given that I have ground-truth annotations for them.
Here is a screenshot from a recent paper on this topic:
What would be good starter code? I would rather not dig into something complex like OpenPose, Simple Baseline by Microsoft (ECCV 2018), or DeeperCut. I am looking for something quite simple that can predict the landmarks in a supervised manner. It would be totally fine, though, if it used some form of transfer learning from the human pose literature (as long as that doesn’t make the learning worse).
For starters, I have an annotated dataset of 800 frames with four landmarks each.
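To make the supervised setup concrete, here is a minimal sketch of how I picture the targets and the evaluation, in plain Python with no deep learning framework. It assumes the baseline would be framed as direct coordinate regression (four landmarks flattened into an 8-value vector normalised by the frame size), which is only one possible framing. The names `encode_landmarks`, `decode_landmarks`, and `mean_pixel_error` are my own placeholders and do not come from any library or paper.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in pixel coordinates

NUM_LANDMARKS = 4  # matches the dataset described above


def encode_landmarks(points: List[Point], width: int, height: int) -> List[float]:
    """Flatten (x, y) pixel landmarks into a target vector normalised by frame size."""
    if len(points) != NUM_LANDMARKS:
        raise ValueError(f"expected {NUM_LANDMARKS} landmarks, got {len(points)}")
    target: List[float] = []
    for x, y in points:
        target.extend([x / width, y / height])
    return target


def decode_landmarks(vector: List[float], width: int, height: int) -> List[Point]:
    """Invert encode_landmarks: map a normalised vector back to pixel coordinates."""
    if len(vector) != 2 * NUM_LANDMARKS:
        raise ValueError(f"expected {2 * NUM_LANDMARKS} values, got {len(vector)}")
    return [
        (vector[i] * width, vector[i + 1] * height)
        for i in range(0, len(vector), 2)
    ]


def mean_pixel_error(pred: List[Point], truth: List[Point]) -> float:
    """Average Euclidean distance in pixels between predicted and annotated landmarks."""
    if len(pred) != len(truth) or not truth:
        raise ValueError("pred and truth must be non-empty and of equal length")
    dists = [math.hypot(px - tx, py - ty) for (px, py), (tx, ty) in zip(pred, truth)]
    return sum(dists) / len(dists)
```

In this framing, the CNN would take a frame as input and output the 8-value vector, trained with a regression loss against `encode_landmarks(...)`; the network itself is the part I am asking about.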
Please let me know if you need any further information.
P.S.: I recently came across a paper that does domain adaptation from human pose estimation to animal pose estimation: "Cross-Domain Adaptation for Animal Pose Estimation". I am not sure how much control I would have over implementing it, though, or how easy it is to generalize to an arbitrary animal.