How to get the input shape of a model?

Hello,

I loaded a model from the torchvision module,

but how can I get the model’s input shape?

Thank you so much for your help.


If you are loading a standard model, let’s say VGG, you can very easily find the shape of the input tensor from a Google search.
Most vision models have an input shape of 3x224x224 (AFAIK).
But do double-check…
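For example, a quick smoke test with a dummy tensor will tell you whether a given size is accepted. A minimal sketch with VGG-16 (constructed without pretrained weights, since only shapes matter here):

```python
import torch
import torchvision

# Standard ImageNet classifiers accept (N, 3, 224, 224).
model = torchvision.models.vgg16()  # random weights are fine for a shape check
model.eval()

x = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    out = model(x)
print(out.shape)  # torch.Size([1, 1000]) -- ImageNet logits
```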

I want to know the input shape for ResNet3D-18:

torchvision.models.video.r3d_18
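For reference, torchvision’s video models take 5D tensors of shape (N, C, T, H, W), and r3d_18 was trained on 16-frame 112x112 clips from Kinetics-400. A minimal smoke test (random weights, since only shapes matter; the clip length and resolution are the training defaults, not hard limits):

```python
import torch
import torchvision

# Video models take 5D tensors: (batch, channels, frames, height, width).
model = torchvision.models.video.r3d_18()
model.eval()

clip = torch.randn(1, 3, 16, 112, 112)  # 16-frame 112x112 clip, as used in training
with torch.no_grad():
    out = model(clip)
print(out.shape)  # torch.Size([1, 400]) -- Kinetics-400 class scores
```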

I am having the same problem coming from TF/Keras: for PyTorch conv models I can’t tell what the input size should be when I use one implementation or another.

Relying on Google searches seems awkward, since the input tensor size should be a property of the network/architecture. I am worried I might have missed something about using torch.

Isn’t there any built-in method to query the input size, or at least to help deduce it?

I even had a problem with the qubvel implementation of U-Net with an EfficientNet encoder, where the efficientnet-b2 encoder got shape mismatches for 260x260 images (the official size) but worked for 256x256.

Most torchvision convolutional networks can work with different image sizes, except perhaps for this one:

Important: In contrast to the other models the inception_v3 expects tensors with a size of N x 3 x 299 x 299, so ensure your images are sized accordingly.

This note is from the torchvision.models documentation. So, to be safe, I would at least check the documentation.
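If the docs are silent, you can also deduce acceptable sizes empirically. Here is a sketch of a probe helper (my own utility, not a torch API) that feeds dummy tensors of candidate sizes and catches shape errors:

```python
import torch
import torchvision

def probe_input_sizes(model, sizes, channels=3):
    """Try square dummy inputs and report which sizes the model accepts."""
    model.eval()
    for s in sizes:
        x = torch.randn(1, channels, s, s)
        try:
            with torch.no_grad():
                model(x)
            print(f"{s}x{s}: ok")
        except RuntimeError as err:
            print(f"{s}x{s}: rejected ({str(err).splitlines()[0]})")

# resnet18 is size-flexible thanks to its adaptive average pooling:
probe_input_sizes(torchvision.models.resnet18(), [112, 224, 260, 299])
```

Running the same probe on the U-Net from the earlier post would show 260 rejected and 256 accepted.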

On the other hand, not all sizes are meaningful for networks of different depths, because they could downsample images too much or too little, yielding suboptimal receptive fields.
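That is likely also what bit the U-Net example above: an encoder with five 2x downsampling stages needs input height and width divisible by 32, so 256 works while 260 (the official classification size for efficientnet-b2) does not. A tiny helper to round a size up to a valid one (my own sketch, not a library function):

```python
def round_up_to_stride(size, stride=32):
    """Round a spatial size up to the next multiple of the network's total stride."""
    return ((size + stride - 1) // stride) * stride

print(round_up_to_stride(260))  # 288
print(round_up_to_stride(256))  # 256
```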

Given that, I can definitely recommend this GitHub repo. There is a bunch of pretrained models, and you can check the image size used for training each model in the results spreadsheets here.
Hope it helps!