Hi all,
I want to extract video features. Is there a pre-trained C3D (https://arxiv.org/abs/1412.0767) network available?
Thanks
Hi, Gkv,
There are PyTorch implementations of the more advanced I3D and P3D networks.
P3D: Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks, ICCV 2017
I3D: Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset, CVPR 2017
Hi zhanghaoinf,
Thanks for your kind reply. I am very new to PyTorch. Can you provide some links that show how to use these kinds of implementations in my PyTorch code? (I know how to load models using torchvision.models.)
Thanks
I have tested P3D-Pytorch. It's pretty simple and should follow a similar process to I3D.
- Pre-process: for each frame in a clip, there is a pre-processing step: subtract the mean and divide by the std.
An example:
import cv2
import numpy as np
mean = (104 / 255.0, 117 / 255.0, 123 / 255.0)  # per-channel means (BGR order)
std = (0.225, 0.224, 0.229)                     # per-channel stds (BGR order)
frame = cv2.imread('a string to image path')    # OpenCV loads uint8, BGR
frame = frame.astype(np.float32)                # cast before in-place float ops
frame /= 255.0  # [0, 255] -> [0, 1]
frame -= mean   # subtract means
frame /= std    # divide by std
frame = frame[:, :, (2, 1, 0)]  # convert BGR image to RGB image
- A clip is a stack of frames, each of size H x W x 3, so a clip has size T x H x W x 3.
Since P3D requires input in the form 3 x T x H x W, perform:
clip = clip.permute(3, 0, 1, 2).contiguous()
- Load the P3D model
(P3D, Bottleneck, and p3d_model_path can be found in the mentioned GitHub repo.)
model = P3D(Bottleneck, [3, 8, 36, 3], modality='RGB')
p3d_weights = torch.load('a string to p3d model path')['state_dict']
model.load_state_dict(p3d_weights)
out = model(data)
print(out.size(), out)
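Putting the steps above together, here is a minimal shape-bookkeeping sketch of how a clip tensor is assembled for the forward pass. The clip geometry (T = 16 frames at 160 x 160) is an assumption; check the repo for the expected input size. Random arrays stand in for the preprocessed frames, and the actual model call is left as a comment since the weights are not loaded here.

```python
import numpy as np
import torch

# Assumed clip geometry (check the repo's expected input size).
T, H, W = 16, 160, 160

# Stand-ins for T preprocessed frames, each H x W x 3 as produced above.
frames = [np.random.rand(H, W, 3).astype(np.float32) for _ in range(T)]

clip = torch.from_numpy(np.stack(frames))     # T x H x W x 3
clip = clip.permute(3, 0, 1, 2).contiguous()  # 3 x T x H x W
data = clip.unsqueeze(0)                      # 1 x 3 x T x H x W (batch of one)
print(data.shape)  # torch.Size([1, 3, 16, 160, 160])

# With the real model loaded as shown above, inference would look like:
# model.eval()
# with torch.no_grad():
#     out = model(data)
```

Note the unsqueeze(0): the model's forward pass expects a leading batch dimension, so a single clip becomes a batch of one.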
Hi,
Thanks a lot for your fast reply. One more doubt: what are these modalities (RGB and Flow)?
Thanks
The RGB modality refers to frames as directly captured by the camera.
Flow refers to optical flow computed between adjacent frames, which reflects motion information.
So in P3D, the Flow modality is for getting the motion description of the video and RGB is for the spatial features?
Yes, you are right.
One more thing: optical flow can reflect motion patterns to some extent, but it is different from motion descriptors like dense trajectories. There exist some implementations to extract optical flow, such as fast-flow, brox-flow, and TV-L1. I didn't study this part too much. Here are some references:
[1]. FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks [FlowNet v2], CVPR '17
[2]. FlowNet: Learning Optical Flow with Convolutional Networks [FlowNet v1], ICCV '15
[3]. Fast Optical Flow Using Dense Inverse Search [fast-flow], ECCV '16
[4]. TV-L1 Optical Flow Estimation [TV-L1], Image Processing On Line '13
[5]. High Accuracy Optical Flow Estimation Based on a Theory for Warping [brox-flow], ECCV '04
Hi zhanghaoinf,
Can the model easily run on a single GPU, or does it require multiple GPUs?
Hi, I met the same question as you, and I found a PyTorch implementation of C3D. The link follows:
https://github.com/DavideA/c3d-pytorch
Greetings!
I am interested in replicating the C3D paper by Du Tran. The original repository is in Caffe. Could you please share a fast.ai or PyTorch implementation of the same?
I was able to find the following resources:
---------------
PyTorch-Video-Recognition
Repository containing models for video action recognition, including C3D, R2Plus1D, and R3D, implemented using PyTorch (0.4.0)
Trained on UCF101 and HMDB51 datasets
---------------
Pytorch porting of C3D network, with Sports1M weights
Defining the C3D model as per the paper, not the complete implementation
---------------
I am aware that developments have been made in the field of action recognition since 2015, but I am specifically interested in the C3D paper by FAIR.
Thanking you (all) in anticipation