Classifying spatio-temporal data/videos

Subhankar_Ghosh · February 7, 2020, 2:43am

Hi!
I am a PyTorch newbie. As part of my research I need to build a video classifier and train it on a dataset of different kinds of fluids (videos of fluid flows).
I have previously worked on image classification using PyTorch but I am a bit clueless about how to do the same on videos.
Can you all please suggest some resources/pointers that might be helpful?

Thanks!

christopherkuemmel · February 7, 2020, 8:24am

Hi @Subhankar_Ghosh,
For video processing you need to capture how successive frames relate to each other. For example, this can be done with sequential models like RNNs, or with 3D CNNs (the extra dimension runs over successive frames). “All you need” is to wrap an image classifier within a sequential model (or extend it by a time dimension). Since you have worked with image classifiers, I’m sure you will get there.
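
To make the "wrap an image classifier within a sequential model" idea concrete, here is a minimal sketch. It is only an illustration under assumptions: the class name `FrameCNNThenLSTM`, the tiny CNN, the layer sizes, and the five output classes are all made up for this example and are not taken from any particular repository. Each frame goes through the same 2D CNN, and an LSTM then reads the per-frame features in order.

```python
import torch
import torch.nn as nn


class FrameCNNThenLSTM(nn.Module):
    """Hypothetical sketch: a per-frame 2D CNN followed by an LSTM over time."""

    def __init__(self, num_classes=5, feat_dim=32, hidden_dim=64):
        super().__init__()
        # Small stand-in for an image classifier backbone (assumed, not tuned).
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # (N*T, 16, 1, 1)
            nn.Flatten(),             # (N*T, 16)
            nn.Linear(16, feat_dim),  # (N*T, feat_dim)
        )
        self.rnn = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, num_classes)

    def forward(self, video):
        # video: (N, T, C, H, W), i.e. a batch of clips with T frames each
        n, t, c, h, w = video.shape
        # Fold time into the batch so the 2D CNN sees individual frames.
        feats = self.cnn(video.reshape(n * t, c, h, w))
        feats = feats.reshape(n, t, -1)  # (N, T, feat_dim)
        # Use the last hidden state as a summary of the whole clip.
        _, (h_n, _) = self.rnn(feats)    # h_n: (1, N, hidden_dim)
        return self.head(h_n[-1])        # (N, num_classes)


model = FrameCNNThenLSTM()
clips = torch.randn(2, 8, 3, 64, 64)  # 2 random clips, 8 RGB frames of 64x64
logits = model(clips)
print(logits.shape)  # torch.Size([2, 5])
```

In practice the small CNN here would probably be replaced by a pretrained image model, but the reshape-then-LSTM pattern stays the same.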

I found these links to be helpful:

Good luck and have fun with your implementation!

Subhankar_Ghosh · February 7, 2020, 4:32pm

Hi @christopherkuemmel, thanks for the help.

Do you think the Conv3d layers in PyTorch can be used for videos?

christopherkuemmel · February 7, 2020, 5:09pm

@Subhankar_Ghosh For sure. A Conv3d layer expects input of shape (N, C, D, H, W), where N is the batch size, C the number of channels, and D the depth. In your case D is the dimension along which you stack the frames of your video.
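
As a rough sketch of what stacking frames along the depth dimension looks like, the snippet below builds one clip from random tensors and passes it through a single `nn.Conv3d` layer. The frame count, resolution, and channel sizes are arbitrary values chosen for illustration; PyTorch documents the expected input layout as (N, C, D, H, W).

```python
import torch
import torch.nn as nn

# 16 random RGB frames, each of shape (C, H, W) = (3, 64, 64).
# In a real pipeline these would be decoded video frames (assumed here).
frames = [torch.randn(3, 64, 64) for _ in range(16)]

# Stack along a new dimension 1 so frames become the depth axis D.
clip = torch.stack(frames, dim=1)  # (C, D, H, W) = (3, 16, 64, 64)
batch = clip.unsqueeze(0)          # (N, C, D, H, W) = (1, 3, 16, 64, 64)

# kernel_size=3 with padding=1 keeps D, H and W unchanged.
conv = nn.Conv3d(in_channels=3, out_channels=8, kernel_size=3, padding=1)
out = conv(batch)
print(out.shape)  # torch.Size([1, 8, 16, 64, 64])
```

Each 3x3x3 kernel looks at three consecutive frames at a time, which is how the layer picks up motion as well as spatial structure.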