What is the input shape of the PyTorchVideo R(2+1)D model?
Based on these docs I would assume the input shape to be
[batch_size, nb_channels=3, seq_len, height=224, width=224].
Thank you very much. I almost gave up on this version.
It works for the following shape
[B, 3, seq_len = 16, 224, 224]
Initially, I got confused about two things. The PyTorchVideo documentation shows their ResNet model with an input shape of (B, 3, 8, 224, 224), while the R(2+1)D model in torchvision.models works with an input shape of [B, 3, 16, 112, 112].