Video Feature Extraction

Hi all,
Can anyone suggest some pre-trained networks, implemented in PyTorch, that can be used for video feature extraction?



If you use a CNN -> LSTM approach, I believe you can use any of the many pre-trained image classification models to extract per-frame features and feed them to the LSTM. I also found this pre-trained 3D CNN for video classification.

I prefer a 3D CNN approach. I also came across the repo you mentioned, but I am confused by the command-line arguments. What are the --input and --output arguments?

python --input ./input --video_root ./videos --output ./output.json --model ./resnet-34-kinetics.pth --mode feature


Looking at the repository, it seems that --input is a text file containing the names of the videos, like this one. The --video_root argument is the folder containing those video files. The --output file is a JSON file containing the results of the classification. All the arguments are explained here.
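If it helps, the input file can be generated rather than written by hand. This is only a sketch: it assumes the file lists one video file name per line, and the helper name `write_input_list` and the extension filter are my own, not part of the repository. Check the example input file in the repo for the exact format it expects.

```python
import tempfile
from pathlib import Path


def write_input_list(video_root, list_path, extensions=(".mp4", ".avi")):
    """Write the video file names found in video_root to list_path, one per line.

    Hypothetical helper: the one-name-per-line format and the extension
    filter are assumptions, not taken from the repository.
    """
    names = sorted(
        p.name
        for p in Path(video_root).iterdir()
        if p.is_file() and p.suffix.lower() in extensions
    )
    Path(list_path).write_text("\n".join(names) + ("\n" if names else ""))
    return names


# Small demo in a temporary directory with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "videos"
    root.mkdir()
    for name in ("b.mp4", "a.avi", "notes.txt"):
        (root / name).touch()
    listed = write_input_list(root, Path(tmp) / "input")
    written = (Path(tmp) / "input").read_text()

print(listed)
```

You would then pass the generated file as --input and the folder as --video_root.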
