Suppose I have converted videos to frames and the folder structure is such
So now I can use ImageFolder to load the frames by class. But models trained frame-wise seem to perform very poorly for me, so I want to stack all the frames of each video and assign a single label to the stack. What would be an easy way to achieve this?
It seems your pain point is that training is slow. Are you training on batches? Each batch consists of different training frames, and they all go through your neural network in one pass.
What I meant was that I need a way to keep track of which frames belong to which video when I load the dataset, instead of each frame being a separate entity.
Alternatively, is there something like the Keras SlidingFrameGenerator in PyTorch?
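One way to keep the frames of a video together is to index the folder by video rather than by frame. The sketch below assumes a hypothetical layout `root/<class>/<video>/<frame>` (the actual structure is not shown in this thread) and uses only the standard library. `index_videos`, `sliding_clips`, and `VideoClips` are illustrative names, not PyTorch or Keras APIs, and the sliding window is a generic approximation of a sliding frame generator, not a port of the Keras one.

```python
from pathlib import Path


def index_videos(root):
    """Return (class_names, videos); videos is a list of (sorted frame paths, label)."""
    root = Path(root)
    # Assumed layout: root/<class_name>/<video_name>/<frame file>
    class_names = sorted(d.name for d in root.iterdir() if d.is_dir())
    videos = []
    for label, class_name in enumerate(class_names):
        video_dirs = sorted(p for p in (root / class_name).iterdir() if p.is_dir())
        for video_dir in video_dirs:
            # Lexicographic sort: frame names must be zero-padded to stay in order.
            frames = sorted(f for f in video_dir.iterdir() if f.is_file())
            if frames:
                videos.append((frames, label))
    return class_names, videos


def sliding_clips(frames, clip_len, stride=1):
    """Split one video's frame list into overlapping windows of clip_len frames."""
    return [frames[i:i + clip_len]
            for i in range(0, len(frames) - clip_len + 1, stride)]


class VideoClips:
    """Map-style dataset: each item is (list of loaded frames, label) for one clip."""

    def __init__(self, root, clip_len, stride=1, load_frame=str):
        self.class_names, videos = index_videos(root)
        # load_frame is a placeholder; by default it just returns the path as a string.
        self.load_frame = load_frame
        self.clips = [(clip, label)
                      for frames, label in videos
                      for clip in sliding_clips(frames, clip_len, stride)]

    def __len__(self):
        return len(self.clips)

    def __getitem__(self, idx):
        clip, label = self.clips[idx]
        return [self.load_frame(path) for path in clip], label
```

In a PyTorch setting, `load_frame` could open the image and return a tensor, and the list could be combined with `torch.stack` into one clip tensor before it is returned. Because the class defines `__len__` and `__getitem__`, it should be usable with `DataLoader` like any map-style dataset. Setting `clip_len` to the full video length and skipping the window step gives one sample per video.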
You might want to check out SequentialSampler, which is used by the data loader: https://pytorch.org/docs/1.1.0/_modules/torch/utils/data/dataloader.html
My experience with SequentialSampler is that it keeps your training data in its original order, so the video frame sequence is preserved during training.
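One caveat: a sequential sampler preserves the global order but does not know where one video ends, so a fixed-size batch can mix the last frames of one video with the first frames of the next. A minimal sketch of a workaround, assuming each frame-level sample can be tagged with a video identifier (the `video_ids` list and the `per_video_batches` helper are hypothetical, not part of PyTorch):

```python
from itertools import groupby


def per_video_batches(video_ids):
    """Yield one list of consecutive dataset indices per video.

    video_ids[i] is the (hypothetical) video identifier of sample i; samples of
    the same video are assumed to be stored next to each other, in frame order.
    """
    for _, group in groupby(enumerate(video_ids), key=lambda pair: pair[1]):
        yield [index for index, _ in group]


# Five frames: three from "vid_a", two from "vid_b".
batches = list(per_video_batches(["vid_a", "vid_a", "vid_a", "vid_b", "vid_b"]))
print(batches)  # [[0, 1, 2], [3, 4]]
```

A list built this way could be passed to the loader as `DataLoader(dataset, batch_sampler=batches)`, so that each batch holds the frames of exactly one video, in order. With ImageFolder, the identifiers could presumably be derived from the file paths in `dataset.samples`.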