Hi, I am playing with the pre-trained ResNet101 in torchvision. I tried different input image sizes (224x224, 336x336, 224x336) and they all seem to work. So what is the exact valid range of input sizes for the pre-trained ResNet?
I think the valid input size is 224x224. Maybe you are using preprocessing in your code, so that whatever the input size of the image is, it gets cropped to 224x224. Thanks.
I printed the modules in the ResNet and found out why:
The AvgPool before the last FC is like this:
(avgpool): AvgPool2d (size=7, stride=7, padding=0, ceil_mode=False, count_include_pad=True)
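Therefore, as long as the input image size makes the AvgPool output a tensor of size 1x2048x1x1 (for a single image), there is no problem. But if the input size is larger than 224x224, the final feature map is larger than 7x7, and it is cropped implicitly at the AvgPool layer: the single 7x7 window with stride 7 covers only the top-left 7x7 region and ignores the remaining rows and columns.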
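To make the valid range concrete, here is a rough sketch in plain Python (no torch needed) that traces the spatial size through the network. It assumes the usual torchvision ResNet layout: a 7x7 stride-2 conv with padding 3, a 3x3 stride-2 max pool with padding 1, and three later stages that each halve the size, followed by the AvgPool2d(7, stride=7) printed above. The helper names are made up for illustration; they are not part of torchvision.

```python
def out_size(size, kernel, stride, padding):
    # Standard conv/pool output-size formula with floor rounding (ceil_mode=False).
    return (size + 2 * padding - kernel) // stride + 1


def feature_map_size(size):
    # Assumed stem: conv1 (7x7, stride 2, padding 3), then maxpool (3x3, stride 2, padding 1).
    size = out_size(size, 7, 2, 3)
    size = out_size(size, 3, 2, 1)
    # layer1 keeps the size; layer2..layer4 are assumed to halve it once each,
    # modeled here as a 3x3 stride-2 conv with padding 1.
    for _ in range(3):
        size = out_size(size, 3, 2, 1)
    return size


def avgpool_size(size, kernel=7, stride=7):
    # AvgPool2d(7, stride=7, padding=0); a result of 0 means the feature map is too small.
    return out_size(size, kernel, stride, 0)


def fits_fc(height, width):
    # The FC layer expects 2048 features, so the pooled output must be exactly 1x1.
    pooled_h = avgpool_size(feature_map_size(height))
    pooled_w = avgpool_size(feature_map_size(width))
    return pooled_h == 1 and pooled_w == 1


if __name__ == "__main__":
    for h, w in [(224, 224), (336, 336), (224, 336), (448, 448), (192, 192)]:
        print(h, w, feature_map_size(h), feature_map_size(w), fits_fc(h, w))
```

Under these assumptions, each side of the input can be anywhere from 193 to 416 pixels: that keeps the final feature map between 7 and 13 per side, so the pooled output is still 1x1. At 448x448 the feature map is 14x14, the AvgPool output becomes 2x2, and the FC layer no longer matches.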