Is there an example of how to use the video classification models in torchvision?
pytorch version : 1.7.1
os : win10 64
I am trying to run a forward pass through the video classification model with the following script:
import numpy as np
import torch
import torchvision
model = torchvision.models.video.r3d_18(pretrained=True, progress=True)
model.eval()
img = torch.zeros((16, 3, 112, 112))
results = model(img)
I got the following error message:
RuntimeError: Expected 5-dimensional input for 5-dimensional weight [64, 3, 3, 7, 7], but got 4-dimensional input of size [3, 16, 112, 112] instead
After I change the input shape to [64, 3, 3, 7, 7], I can export the model, but both the training code and the docs use [3, 16, 112, 112]. This seems odd. Why does this happen, and how should I use this model properly?
Thanks