Is there any example of how to use the video classification models in torchvision?
pytorch version : 1.7.1
os : win10 64
I am trying to run a forward pass through the video classification model with the following script:
import numpy as np
import torch
import torchvision

model = torchvision.models.video.r3d_18(pretrained=True, progress=True)
model.eval()
img = torch.zeros((16, 3, 112, 112))
results = model(img)
I got this error message:
RuntimeError: Expected 5-dimensional input for 5-dimensional weight [64, 3, 3, 7, 7], but got 4-dimensional input of size [3, 16, 112, 112] instead
After I change the input to shape [64, 3, 3, 7, 7], I can export the model, but the training code and the docs both use [3, 16, 112, 112]. This is weird. Why does this happen? How can I use this model properly?