How to feed 2D images to a conv3d layer?

I plan to use a model made of conv3d layers for video action classification, but I am stuck on how to feed the frames extracted from the videos to the conv3d layers.

Conv3D expects input in the form batch_size x channels x depth x height x width. Your number of frames is the depth. Your channels can be RGB or whatever other information you have for each frame. You may need to use a library to process raw videos into this format.
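As a rough sketch of that layout, assuming PyTorch and frames already decoded into height x width x channels tensors, something like the following should work. The helper name `frames_to_clip`, the random stand-in frames, and the layer sizes are made up for illustration and are not from the original post.

```python
import torch
import torch.nn as nn


def frames_to_clip(frames):
    """Stack a list of (H, W, C) frame tensors into one (C, D, H, W) clip."""
    clip = torch.stack(frames, dim=0)             # (D, H, W, C)
    return clip.permute(3, 0, 1, 2).contiguous()  # (C, D, H, W)


# Hypothetical data: 8 random RGB frames of 32x32 standing in for decoded video frames.
frames = [torch.rand(32, 32, 3) for _ in range(8)]

clip = frames_to_clip(frames)   # (3, 8, 32, 32)
batch = clip.unsqueeze(0)       # add the batch dimension: (1, 3, 8, 32, 32)

# Arbitrary example layer; in_channels must match the channel dimension (3 for RGB).
conv = nn.Conv3d(in_channels=3, out_channels=4, kernel_size=3, padding=1)
out = conv(batch)

print(batch.shape)  # batch_size x channels x depth x height x width
print(out.shape)
```

To build a larger batch, several clips with the same number of frames and the same frame size can be combined with `torch.stack(clips, dim=0)` instead of `unsqueeze(0)`.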


Thank you. Can you suggest any libraries?