Question about VGG-M feature extraction

Hello everyone,

I’m trying to use VGG-M to extract video features, but I have run into some problems.

  1. I am supposed to use a CNN to extract features. The CNN ingests a sliding window of 5 frames, which means the input of the network is 5 grayscale images. However, in the VGG-M model I downloaded, the input is fixed to a 3-channel (RGB) image. I have tried to change the input dimension, but I got a dimension mismatch error. How can I build a 5-channel convolutional filter at conv1 and still use the VGG-M model?

  2. I also tried extracting features from a 3-channel (RGB) image. After model.features the output has shape (1, 512, 3, 3). But I saw in the code that if I use classif, the input should be of size 18432, and 512 × 3 × 3 = 4608, not 18432. I tried to change the dimensions here as well, but I still get a dimension mismatch error.
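
To make the mismatch in point 2 concrete, here is the arithmetic as a small Python sketch. The helper name flattened_size is my own and not part of the model code, and I am assuming the feature map that classif expects is square with 512 channels.

```python
import math

def flattened_size(shape):
    """Values per sample after flattening (N, C, H, W) into (N, C*H*W)."""
    _, channels, height, width = shape
    return channels * height * width

features_shape = (1, 512, 3, 3)  # shape I get from model.features
expected_in = 18432              # input size that classif expects

got = flattened_size(features_shape)
print(got)  # 4608

# Side length of the feature map classif would need,
# assuming it is square and has 512 channels (my assumption).
side = math.isqrt(expected_in // 512)
print(side)  # 6
```

So classif seems to expect a 6 × 6 feature map, while my input only produces 3 × 3. I suspect my input images are smaller than the model expects, but I am not sure.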

Thank you for your help. I greatly appreciate it.