Hello,
I loaded a model from the torchvision module, but how can I get the model's input shape?
Thank you so much for your help.
If you are loading a standard model, let's say VGG, you can easily find the shape of the input tensor with a quick search.
Most vision models have an input shape of 3x224x224 (AFAIK).
But do check once…
I want to know the input shape for ResNet3D-18:
torchvision.models.video.r3d_18
I am having the same problem coming from TF/Keras: for PyTorch conv models I can't tell what the input size should be when I use one implementation or another.
Relying on web searches seems awkward, since the input tensor size should be a property of the network/architecture. I am worried I might have missed something about using torch.
Isn't there any built-in method to query the input size, or at least to help deduce it?
I even had a problem with the qubvel implementation of Unet with an EfficientNet encoder, where the effnet-b2 encoder got shape mismatches for 260x260 images (the official size) but worked for 256.
Most of the torchvision convolutional networks can work with different image sizes, except perhaps this one:
Important: In contrast to the other models the inception_v3 expects tensors with a size of N x 3 x 299 x 299, so ensure your images are sized accordingly.
This note is from the torchvision.models documentation. So, to be safe, I would at least check the documentation.
On the other hand, not all sizes are meaningful for networks of different depths, because they could convolve the images too much or too little, ending up with suboptimal receptive fields.
Given that, I can definitely recommend this GitHub repo. There is a bunch of pretrained models, and you can check the image size used for training each model in the results spreadsheets there.
Hope it helps!