Size mismatch error when running Transfer Learning tutorial

Resnet, at least in torchvision versions where the final average pooling layer has a fixed 7x7 window, is built to take only images of size 224x224.

Resnet consists of several convolution and pooling layers that transform a three-channel image of size 224x224 into an “image” with 512 features (for resnet18, the model used in the tutorial) and size 1x1. This is then fed into a Linear layer that expects exactly 512 input features.

If you input an image that is bigger than 224x224, the convolution and pooling layers will transform it into an “image” with 512 features that is bigger than 1x1, so there will be too many inputs for the Linear layer. For example, if the convolution and pooling output is of size 2x2, that makes 512 x 2 x 2 = 2048 inputs for the Linear layer.

If you input an image that is smaller than 224x224, then the average pooling layer will complain about its calculated output size being negative.
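To make the size arithmetic concrete, here is a rough sketch in plain Python (no PyTorch needed). It assumes the resnet18 layout as I understand it: a 7x7 stride-2 convolution, a 3x3 stride-2 max pooling, three stages that each halve the size, and a fixed 7x7 average pooling at the end. The function names are made up for illustration and are not part of torchvision.

```python
def conv_out(size, kernel, stride, padding):
    """Output side length of a convolution or pooling layer (floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1


def resnet18_fc_inputs(image_size, channels=512):
    """Return (side length after average pooling, inputs seen by the Linear layer).

    The layer parameters below are assumptions about the resnet18 layout.
    The second value is None when the pooled size is not positive.
    """
    s = conv_out(image_size, kernel=7, stride=2, padding=3)  # first convolution
    s = conv_out(s, kernel=3, stride=2, padding=1)           # max pooling
    for _ in range(3):                                       # three stride-2 stages
        s = conv_out(s, kernel=3, stride=2, padding=1)
    s = conv_out(s, kernel=7, stride=1, padding=0)           # fixed 7x7 average pooling
    features = channels * s * s if s > 0 else None
    return s, features


for size in (224, 256, 128):
    print(size, resnet18_fc_inputs(size))
# 224 (1, 512)
# 256 (2, 2048)
# 128 (-2, None)
```

With 224 the Linear layer gets exactly 512 inputs, with 256 it gets 2048, and with 128 the pooled size comes out negative, which is the case the average pooling layer complains about.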

Basically Resnet can’t be used with images that aren’t 224x224 unless you crop or rescale them so that they are 224x224.
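As a sketch of what "rescale then crop" means in numbers, the snippet below computes the geometry in plain Python: scale the shorter side to 256, then take a centered 224x224 window. The helper names and the 256 target are my own choices for illustration. In a real data pipeline you would normally let torchvision's resize and center-crop transforms do this, and their exact rounding may differ slightly from this sketch.

```python
def resize_shorter_side(width, height, target=256):
    """New (width, height) after scaling so the shorter side equals target."""
    if width <= height:
        return target, int(target * height / width)
    return int(target * width / height), target


def center_crop_box(width, height, crop=224):
    """(left, top, right, bottom) of a centered crop x crop window."""
    left = (width - crop) // 2
    top = (height - crop) // 2
    return left, top, left + crop, top + crop


# Hypothetical 500x375 input image
w, h = resize_shorter_side(500, 375)
print((w, h))                 # (341, 256)
print(center_crop_box(w, h))  # (58, 16, 282, 240)
```

The resulting box is 224 wide and 224 high, so whatever the original image size was, the network ends up seeing a 224x224 input.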
