I am attempting to train an LSTM to predict video frames. The frames will be represented as feature vectors pulled from the model's outputs. I am trying to get the 2048-dimensional features from right before the classification layer, but am struggling to do so. When I use IntermediateLayerGetter, I receive this error when trying to run the new model:
RuntimeError: Expected 4-dimensional input for 4-dimensional weight [192, 768, 1, 1], but got 2-dimensional input of size [1, 1000] instead
Any hints as to what I am doing wrong or how to get these features?