I am new to PyTorch and I am following the transfer learning tutorial to build my own classifier. One thing I would like to know is: how do I change the input size of ResNet? Right now it only takes images of size (224, 224), but I would like it to work on images of size (512, 512).
What is the best way to achieve this? Is it better to create my own network? And specifically, which layers do I need to change to make the network work?
I am a college undergrad using PyTorch for my research, but I am new to deep learning and only know some basic facts about ResNet. If the input size cannot be changed without modifying the entire network, please let me know!
Look at what convolutional and pooling layers do: they work on any input size, so your network will work on any input size, too.
You only have to change the fully connected layers, such as the nn.Linear layers in VGG. Newer nets use a global pooling layer before the fully connected layers, so you do not even have to change the last linear layer.
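To illustrate the point above, here is a minimal sketch (a toy conv/pool stack, not a real torchvision model): the convolutional and pooling layers run fine on any input size, but the spatial size of their output changes with the input, which is exactly what breaks a downstream nn.Linear with a fixed in_features.

```python
import torch
import torch.nn as nn

# Toy feature extractor: conv and pooling layers place no constraint on
# the input's spatial size, only on its channel count (3 here).
features = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
)

for size in (224, 512):
    out = features(torch.randn(1, 3, size, size))
    # The channel count (8) is fixed, but the spatial size scales with
    # the input, so a fixed-size nn.Linear after flattening would fail.
    print(size, tuple(out.shape))
```

Running this prints `(1, 8, 112, 112)` for the 224 input and `(1, 8, 256, 256)` for the 512 input: the convolutions accept both, but any hard-coded fully connected layer after them would only match one of the two.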
But when I push images of size (512, 512) instead of (224, 224) into my ResNet, I get a 'size mismatch' error at the AvgPool layer. I am not sure why, because from the documentation these layers seem to work for any shape.
If I change that AvgPool layer to nn.AdaptiveAvgPool2d(1) it works, but it doesn't look very efficient.
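For anyone following along, a minimal sketch of why the adaptive swap works (assuming 512-channel feature maps as in ResNet-18/34; ResNet-50 would have 2048): nn.AdaptiveAvgPool2d fixes the *output* size, so the final nn.Linear always sees the same number of features, whatever the input image size was.

```python
import torch
import torch.nn as nn

# Adaptive pooling fixes the output spatial size (1x1 here), so the
# following nn.Linear gets the same feature count for any input size.
adaptive = nn.AdaptiveAvgPool2d(1)

for hw in (7, 16):                   # 224x224 -> 7x7, 512x512 -> 16x16 feature maps
    x = torch.randn(1, 512, hw, hw)  # fake pre-pooling ResNet feature map
    print(tuple(adaptive(x).shape))  # (1, 512, 1, 1) both times
```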
I don’t think there is much of a difference between adaptive average pooling and plain average pooling.
Just change the AvgPool kernel size from 7 to 16 and it should work, too: ResNet downsamples its input by a factor of 32, so a 224x224 image reaches the pooling layer as a 7x7 feature map, and a 512x512 image as a 16x16 one.
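A minimal sketch of that arithmetic (again assuming 512-channel feature maps as in ResNet-18/34): with the kernel size set to 16, a fixed nn.AvgPool2d produces the same (1, 512, 1, 1) output as the adaptive version, which simply infers the kernel from the input.

```python
import torch
import torch.nn as nn

# ResNet downsamples by a factor of 32:
# 224 // 32 == 7 and 512 // 32 == 16.
assert 224 // 32 == 7 and 512 // 32 == 16

pool = nn.AvgPool2d(kernel_size=16)  # fixed kernel, matches 512x512 inputs only
adaptive = nn.AdaptiveAvgPool2d(1)   # infers the kernel from the input itself

x = torch.randn(1, 512, 16, 16)      # fake feature map for a 512x512 input
print(tuple(pool(x).shape), tuple(adaptive(x).shape))  # both (1, 512, 1, 1)
```

So the two do the same work on a 16x16 map; the adaptive layer just trades the hard-coded kernel size for a tiny bit of shape arithmetic at runtime.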
Hey tsjoeph!
I arrived at this topic while I was working with FasterRCNN. I tried to reduce the input shape from 1024x1024 to 512x512 and it worked. Now I’m trying to understand why. Can you explain it in a bit more depth, if you have some time?
Thank you!