Size of input images/tensor to mobilenet and other pretrained models

I am relatively new to transfer learning and I have been working through the tutorials on PyTorch's website. I notice that they all say the minimum input size they expect is 224x224, and I get why - the kernels are trained on 224x224 images to extract valuable features.

However, I just tried to use MobileNetV2 with a dimension of 128x128 and it worked fine, and the results are also impressive. To be clear, my input tensor is of size (3, 128, 128). Why? Shouldn't it error out since the minimum dimension size is not met?

Does that now mean that if I manage to feed it 224x224 (memory was an issue, hence I downsized), the result will be better?

I guess the last pooling layer in your implementation is an adaptive pool, which can deal with any input size and always produces an output of shape (b, c, 1, 1). So it is OK to use 128x128.
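To illustrate the point, here is a minimal sketch (the channel count 1280 matches MobileNetV2's final feature map, but the spatial sizes are just examples): `AdaptiveAvgPool2d((1, 1))` collapses any spatial resolution down to 1x1, so the classifier head never sees the difference.

```python
import torch
import torch.nn as nn

# Adaptive pooling targets a fixed output size regardless of input size
pool = nn.AdaptiveAvgPool2d((1, 1))

# Feature maps from different input resolutions (e.g. 224 -> 7x7, 128 -> 4x4)
for spatial in (7, 4):
    feat = torch.randn(2, 1280, spatial, spatial)
    out = pool(feat)
    print(out.shape)  # torch.Size([2, 1280, 1, 1]) in both cases
```

A fixed-size pooling layer (e.g. `nn.AvgPool2d(7)`) would instead bake the 224x224 assumption into the network, which is why older architectures were stricter about input size.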

Besides, because MobileNetV2 has a total downsampling rate of 32, your input size is OK as long as it is at least 32.

224x224 is a commonly used size for the ImageNet classification task, and I think 224 is very likely to give better results than 128.

Thank you @Alpha for your reply. It makes sense.