Transfer learning from images to videos

If I have a dataset of images labeled with their facial expressions (happy, sad, etc., from a set of 8 emotions), and also a set of video clips that don’t have labels:

  1. how can I learn the facial expression labels from these videos?
  2. can you please link me to a similar paper from the literature?
  3. overall, how can I predict the emotion/facial expression for each of these short unlabeled video clips (assuming there are better ways than transfer learning from AffectNet face images, which have facial expression labels)?
  4. Is there starter code in PyTorch that can get me acquainted with the concept through coding?
  5. I was looking for a dataset for detecting emotions from short video clips, and my search led me to the DISFA dataset. However, after downloading it, it does not seem to be exactly what I was looking for. Is there a dataset of short video clips (under 1 minute) with their associated facial expression labels?

Please let me know if more information might be needed from my side.
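On question 4, here is a minimal sketch of the simplest transfer-learning baseline implied above: run an image-level expression classifier on every frame of a clip and average the softmax probabilities over time to get a clip-level prediction. The `FrameClassifier` below is a hypothetical stand-in; in practice you would replace it with a model fine-tuned on your labeled images (e.g. AffectNet), and the 8-class label list is an assumption.

```python
import torch
import torch.nn as nn

# Assumed 8-emotion label set (adjust to match your image dataset).
EMOTIONS = ["neutral", "happy", "sad", "surprise",
            "fear", "disgust", "anger", "contempt"]

class FrameClassifier(nn.Module):
    """Tiny stand-in for an image expression model.
    Replace with your network fine-tuned on labeled face images."""
    def __init__(self, num_classes: int = 8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(16, num_classes),
        )

    def forward(self, x):
        return self.net(x)

@torch.no_grad()
def predict_clip(model: nn.Module, frames: torch.Tensor):
    """frames: [T, 3, H, W] tensor of a clip's (face-cropped) frames.
    Runs the image model on each frame and averages the softmax
    probabilities over time -- the simplest temporal aggregation."""
    model.eval()
    frame_probs = model(frames).softmax(dim=1)  # [T, num_classes]
    clip_probs = frame_probs.mean(dim=0)        # [num_classes]
    return EMOTIONS[clip_probs.argmax().item()], clip_probs

model = FrameClassifier()
clip = torch.randn(12, 3, 64, 64)  # 12 dummy frames standing in for a decoded clip
label, clip_probs = predict_clip(model, clip)
print(label, clip_probs.shape)
```

The same loop is also the starting point for question 1: clips where `clip_probs` is confidently peaked can be kept as pseudo-labels to fine-tune the model on video frames (self-training), while low-confidence clips are discarded.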

Hi! I worked on a similar topic (transfer learning from videos to images).
Here is a repo: Human Pose Estimation From Videos
There are some references provided.
Hope it helps!