RuntimeError: Expected 5-dimensional input for 5-dimensional weight [64, 3, 3, 7, 7], but got 4-dimensional input of size [3, 16, 112, 112] instead
After I change the input to [64, 3, 3, 7, 7], I can export the model, but the training code and the docs both use [3, 16, 112, 112]. This is weird. Why does this happen? How can I use this model properly?
When the docs say [3, 16, 112, 112], they are not including the batch size in those dimensions. You still need a batch dimension for the input, so it would look more like [64, 3, 16, 112, 112] for a batch of 64, or [1, 3, 16, 112, 112] for a single clip. The shape in the error message, [64, 3, 3, 7, 7], is the shape of the first convolution's weight (output channels, input channels, and a 3x7x7 kernel), not the shape your input should have, so you can pretty much ignore its values. All you need to take from the error is the number of dimensions (5). For the sizes themselves, use the dimensions from the docs.
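To make the difference between the weight shape and the input shape concrete, here is a minimal sketch. It assumes the first layer is an `nn.Conv3d` with 3 input channels, 64 output channels and a 3x7x7 kernel, which is what the weight shape in the error implies. The stride and padding values are my own assumptions for illustration and are not taken from the thread.

```python
import torch
import torch.nn as nn

# Stand-in for the model's first layer. Only the weight shape comes from the
# error message; stride and padding are assumed for illustration.
conv = nn.Conv3d(
    in_channels=3,
    out_channels=64,
    kernel_size=(3, 7, 7),
    stride=(1, 2, 2),
    padding=(1, 3, 3),
    bias=False,
)

# Weight layout is [out_channels, in_channels, kT, kH, kW].
print(tuple(conv.weight.shape))  # (64, 3, 3, 7, 7)

# One clip as described in the docs: [channels, frames, height, width].
clip = torch.randn(3, 16, 112, 112)

# Add the missing batch dimension at the front.
batch = clip.unsqueeze(0)
print(tuple(batch.shape))  # (1, 3, 16, 112, 112)

# The 5-D input now passes through the layer.
out = conv(batch)
```

For several clips, `torch.stack(list_of_clips)` builds the same 5-D layout with a batch size larger than 1.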
Does anybody know what input dimensions the r3d18 model needs?
The required number of dimensions is 5. I guess the first dim is the batch.
Then what are the last 4? I tried [batch, frames, filters, W, H] but it did not work… it looks like dims 2, 3 and dims 4, 5 have the same groups…?
PyTorch video models usually require input of shape [batch_size, channels, num_frames, height, width]. We can verify this with PyTorchVideo. PyTorch Hub provides many pre-trained models along with examples of how to use them, and in the PyTorchVideo example the pre-trained model requires shape [batch_size, channels, num_frames, height, width].