How to load video data into 3D Conv

SamTX · September 17, 2019, 1:05pm

Hi,

I’m kind of new to PyTorch and would like to create a data loader for video inpainting. My data is currently in the following format: I have a folder for each video, within which the video is broken down into frames.

-- vid 1
     -- img 0
     -- img 1
         ...
     -- img N

-- vid 2
     -- img 0
     -- img 1
         ...
     -- img N

I’d like to know how to create a data loader that feeds a 3D conv network. I believe the input tensor has to be [batch, channels, sequence, height, width].
Does this mean that in a single batch I must go through the entire video, or just a different sequence?

spanev · September 17, 2019, 2:21pm

Hi @SamTX,

Usually, you will subdivide your videos into sequences (clips) of some fixed length, so each sample in a batch is one clip rather than an entire video.
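
To make that concrete, here is a minimal sketch of a map-style dataset that cuts each frame folder into fixed-length clips. It is only an illustration under some assumptions: `FrameClipDataset`, `default_load_frame`, `clip_len` and `stride` are names made up for this example, frame file names are assumed to sort in temporal order (e.g. zero-padded indices), all frames are assumed to share the same height and width, and the default loader assumes Pillow and NumPy are installed.

```python
import os

import torch
from torch.utils.data import Dataset


def default_load_frame(path):
    """Load one image file as a float tensor of shape [C, H, W] in [0, 1]."""
    import numpy as np
    from PIL import Image  # assumption: Pillow is available

    with Image.open(path) as img:
        array = np.asarray(img.convert("RGB"), dtype=np.float32) / 255.0
    return torch.from_numpy(array).permute(2, 0, 1)  # HWC -> CHW


class FrameClipDataset(Dataset):
    """Fixed-length clips from a root folder with one sub-folder of frames per video."""

    def __init__(self, root, clip_len=16, stride=16, load_frame=default_load_frame):
        self.load_frame = load_frame
        self.clips = []  # each entry is a list of clip_len frame paths
        for video in sorted(os.listdir(root)):
            video_dir = os.path.join(root, video)
            if not os.path.isdir(video_dir):
                continue
            # Assumption: lexicographic order of file names equals temporal order.
            frames = sorted(os.path.join(video_dir, f) for f in os.listdir(video_dir))
            # Slide a window over the video; leftover frames at the end are dropped.
            for start in range(0, len(frames) - clip_len + 1, stride):
                self.clips.append(frames[start:start + clip_len])

    def __len__(self):
        return len(self.clips)

    def __getitem__(self, index):
        frames = [self.load_frame(path) for path in self.clips[index]]
        # Stack along a new dimension 1: list of [C, H, W] -> [C, T, H, W].
        return torch.stack(frames, dim=1)
```

Wrapping this in `torch.utils.data.DataLoader(dataset, batch_size=4, shuffle=True)` should then yield batches of shape [4, channels, clip_len, height, width], which is the layout a 3D convolution expects. A `stride` smaller than `clip_len` gives overlapping clips, which is one simple way to get more samples out of short videos.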

You may want to take a look at DALI and its VideoReader operator. There is a small tutorial showing how to build a video data pipeline, and another showing how to easily plug DALI into PyTorch.