Hi everyone,
I've done several projects on deep-learning-based image processing (classification, …), but I am new to video processing in PyTorch.
I want to analyze video sequences, for example for activity recognition, and I have run into problems preparing the dataset for this task.
I have a training set of different actions in subfolders, and I want to produce a sliding window (e.g. a sequence of 10 frames) to feed to the model (I want my input size to be 10×227×227). How can I do this?
The frame order is important, and I don't want the frames to be sampled randomly!
Have a look at this post.
In this topic the user was dealing with different activities performed by different persons.
Using a custom sampler, you could use a sliding window approach by providing the “invalid” frame indices, where the window should not grab images from.
Would the code example be suitable as starter code?
PS: Tagging certain people might discourage others from answering in your thread.
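The sliding-window idea can be sketched without any PyTorch-specific code: compute the start indices at which a full window fits inside one clip, and treat every other index as "invalid". This is only a minimal sketch, assuming the frames of all clips are concatenated into one flat, ordered list; `sliding_window_starts` and its parameters are hypothetical names, not taken from the linked post.

```python
from typing import List, Sequence


def sliding_window_starts(clip_lengths: Sequence[int],
                          window: int = 10,
                          stride: int = 1) -> List[int]:
    """Return flat start indices of all windows that stay inside one clip.

    Assumption: clip k occupies the flat indices
    [offset_k, offset_k + clip_lengths[k]) of one concatenated frame list.
    """
    starts = []
    offset = 0
    for length in clip_lengths:
        # The last valid start in this clip is offset + length - window;
        # later starts would run into the next clip and are skipped.
        for start in range(offset, offset + length - window + 1, stride):
            starts.append(start)
        offset += length
    return starts


# Two clips of 12 and 10 frames, window of 10 frames:
print(sliding_window_starts([12, 10]))  # [0, 1, 2, 12]
```

The resulting list could be handed to a sampler (for example `torch.utils.data.SubsetRandomSampler(starts)`, or simply iterated in order if the windows themselves should not be shuffled), with the dataset returning frames `start` to `start + window - 1` for each sampled index.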
Thank you very much.
I did as you recommended.
I had first found that post, which was quite close to my problem, but how about defining a dataset class?
Do you want to feed only 10 frames from each video to the model?
Thank you very much for helping.
I have a training set composed of, for example, 13 activity subfolders, and each activity subfolder contains 200 frames.
I want to feed my data to the model like this:
batch n: img(n), img(n+1), …, img(n+9)
for each class of activities.
How should I define the dataset class?
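One way such a dataset class could look is sketched below. It is a hedged illustration rather than a definitive implementation: `list_class_frames`, `SlidingWindowDataset` and the `load_frame` callable are hypothetical names, and the sketch assumes that frame file names sort into temporal order (for example zero-padded names such as `0001.jpg`). The class only implements `__len__` and `__getitem__`, which is the map-style dataset protocol; in a real project you would probably subclass `torch.utils.data.Dataset`.

```python
import os
from typing import Any, Callable, Dict, List, Tuple


def list_class_frames(root: str) -> Dict[str, List[str]]:
    """Map each activity subfolder of `root` to its frame paths in sorted order."""
    frames = {}
    for name in sorted(os.listdir(root)):
        folder = os.path.join(root, name)
        if os.path.isdir(folder):
            # Lexicographic sort: assumes zero-padded frame names.
            frames[name] = [os.path.join(folder, f)
                            for f in sorted(os.listdir(folder))]
    return frames


class SlidingWindowDataset:
    """Item i is `window` consecutive frames of a single class plus its label."""

    def __init__(self, class_frames: Dict[str, List[str]],
                 load_frame: Callable[[str], Any],
                 window: int = 10, stride: int = 1):
        self.load_frame = load_frame
        self.classes = sorted(class_frames)
        self.windows: List[Tuple[List[str], int]] = []
        for label, name in enumerate(self.classes):
            paths = class_frames[name]
            # Windows never cross a class boundary because each class
            # is enumerated separately.
            for start in range(0, len(paths) - window + 1, stride):
                self.windows.append((paths[start:start + window], label))

    def __len__(self) -> int:
        return len(self.windows)

    def __getitem__(self, index: int) -> Tuple[List[Any], int]:
        paths, label = self.windows[index]
        # Frames are loaded in their stored order, so the temporal
        # order inside a window is always preserved.
        return [self.load_frame(p) for p in paths], label
```

If `load_frame` returns a 227×227 tensor, calling `torch.stack` on the returned list gives a 10×227×227 clip. With 13 classes of 200 frames each and a stride of 1, this yields 13 × 191 = 2483 windows; leaving `shuffle=False` in the `DataLoader` keeps the windows themselves in order as well.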
Got it, thank you for the explanation.
I dove into this thread because I'm working on a kind of action recognition as well. But in my case I'm feeding the network a whole sequence at a time. As for you, I'm almost sure you should look into a Sampler.
Thank you very much, everyone, for your great comments and help.
I really appreciate it. I'm trying to dive into samplers to understand the comments precisely and solve this.
Can you please explain to me how to use collate_fn in a DataLoader? I have 20 sentences for each image and need to select a random sentence to train the decoder.
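For the caption case, a `collate_fn` is just a function that receives the list of samples of one batch and returns the batch. The sketch below assumes each dataset item is an `(image, captions)` pair where `captions` is the list of sentences for that image; `random_caption_collate` and the `rng` parameter are hypothetical names used only for illustration.

```python
import random
from typing import Any, List, Sequence, Tuple


def random_caption_collate(batch: Sequence[Tuple[Any, Sequence[str]]],
                           rng: Any = random) -> Tuple[List[Any], List[str]]:
    """Keep every image and pick one random caption per image.

    `batch` is the list of dataset items the DataLoader collected for
    one batch; `rng` only needs a `choice` method.
    """
    images = [image for image, _ in batch]
    captions = [rng.choice(list(sentences)) for _, sentences in batch]
    return images, captions
```

It would be passed as `DataLoader(dataset, batch_size=..., collate_fn=random_caption_collate)`. If the images are tensors of the same shape, the `images` list can be combined with `torch.stack` inside the function. Picking the random sentence in the dataset's `__getitem__` instead would also work and keeps the default collation.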