I am doing some research on lip reading and need to extract the landmarks of the mouth. However, the tools I know of, such as Face++ (https://console.faceplusplus.com/documents/5679127), can only return landmarks for a complete face; otherwise they complain that no human face can be detected. My samples are incomplete human faces: only the mouth area is available.
One idea would be to retrain a model like MTCNN on available face/landmark databases (ones with more landmarks than the original MTCNN uses, of course), but this time cropping each image to match your kind of input and discarding the landmarks you wouldn’t use. It would take minimal code adjustment but heavy preprocessing of the training images…
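To give an idea of the preprocessing part, here is a minimal sketch of the crop-and-discard step for one training sample. It assumes the annotations follow the 68-point iBUG/300-W layout, where indices 48 to 67 are the mouth; other databases will need different indices. The function name `crop_to_mouth` and the `margin` parameter are made up for illustration and are not part of MTCNN or any library.

```python
import numpy as np

# Assumption: 68-point iBUG/300-W annotation layout, mouth = indices 48..67.
MOUTH_IDX = np.arange(48, 68)


def crop_to_mouth(image, landmarks, margin=0.25):
    """Crop a full-face image to the mouth region.

    image:     H x W (or H x W x C) array.
    landmarks: (68, 2) array of (x, y) pixel coordinates.
    margin:    extra padding around the mouth box, as a fraction of its size.

    Returns the cropped image and the mouth landmarks re-expressed in the
    coordinate frame of the crop; all other landmarks are discarded.
    """
    mouth = np.asarray(landmarks, dtype=np.float64)[MOUTH_IDX]

    # Tight bounding box around the mouth points.
    x_min, y_min = mouth.min(axis=0)
    x_max, y_max = mouth.max(axis=0)

    # Pad the box, then clamp it to the image borders.
    pad_x = margin * (x_max - x_min)
    pad_y = margin * (y_max - y_min)
    h, w = image.shape[:2]
    left = int(max(0, np.floor(x_min - pad_x)))
    top = int(max(0, np.floor(y_min - pad_y)))
    right = int(min(w, np.ceil(x_max + pad_x) + 1))
    bottom = int(min(h, np.ceil(y_max + pad_y) + 1))

    crop = image[top:bottom, left:right]

    # Shift the landmarks so (0, 0) is the top-left corner of the crop.
    shifted = mouth - np.array([left, top], dtype=np.float64)
    return crop, shifted
```

You would run something like this over every image in the database and train on the resulting (crop, landmarks) pairs. Randomizing `margin` and jittering the box a little would probably help the model cope with how your real samples are framed.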