How to convert a fully supervised landmark estimation model to a semi-supervised one?

Suppose I have a model that predicts 2D landmarks for a moving, deformable object (e.g. facial landmarks). How can I adapt this model to work in a semi-supervised fashion, where I only have annotations for 500-1000 frames and want the algorithm to annotate the remaining frames, which have no ground truth?

  1. How should the current model in the link be changed?
  2. What evaluation score should be used, since we don’t have ground truth for the rest of the frames and can’t use something like MSELoss on them?
  3. Assume that the 500-1000 frames to be annotated are selected via a clustering algorithm such as K-means.
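
To make point 3 concrete, here is a minimal sketch of the kind of frame selection I have in mind. It is only an illustration under assumptions: `select_frames_to_annotate` is a hypothetical helper name, `features` is assumed to be one feature vector per frame (e.g. downsampled pixels or embeddings, which I have not settled on), and K-means is written out in plain NumPy rather than taken from a library.

```python
import numpy as np


def select_frames_to_annotate(features, n_select, n_iter=50, seed=0):
    """Return up to n_select frame indices, one per K-means cluster.

    features: array-like of shape (n_frames, n_dims); assumed per-frame descriptors.
    The returned frames are those closest to each cluster centroid.
    """
    features = np.asarray(features, dtype=np.float64)
    rng = np.random.default_rng(seed)

    # Initialise centroids from n_select distinct, randomly chosen frames.
    init = rng.choice(len(features), size=n_select, replace=False)
    centroids = features[init].copy()

    for _ in range(n_iter):
        # Squared distances, shape (n_frames, n_select).
        # Note: this builds an (n_frames, n_select, n_dims) array, fine for a sketch.
        d2 = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)

        new_centroids = centroids.copy()
        for c in range(n_select):
            members = features[labels == c]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                new_centroids[c] = members.mean(axis=0)

        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids

    # For each centroid, take the nearest real frame; drop duplicates.
    d2 = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return np.unique(d2.argmin(axis=0))


# Example with random stand-in features (2000 frames, 16 dims).
demo_features = np.random.default_rng(1).normal(size=(2000, 16))
frames_to_label = select_frames_to_annotate(demo_features, n_select=50)
```

The selected indices would be the frames sent for manual annotation; all other frames stay unlabeled. Two centroids can share the same nearest frame, so the function may return slightly fewer than `n_select` indices.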

Here’s an example of a fully supervised 2D facial landmark estimation problem:
