Pretrained sizes 244

bluesky314 · January 24, 2019, 12:34pm

If I am using ResNet models but removing the fc layers, I don’t need to resize my images to 244*244, right? If I were to, is there any chance of losing accuracy?

chenglu · January 24, 2019, 12:46pm

First things first: the size is 224, not 244. I think you should also remove the avgpool layer, or replace it with nn.AdaptiveAvgPool2d. The recommended input size is a multiple of 2^5, i.e. n * 2^5 (n >= 1).

bluesky314 · January 25, 2019, 2:22am

Can you explain why it is recommended to be such?

chenglu · January 25, 2019, 2:42am

There are 5 downsampling “layers” in ResNet, and each one downsamples the feature map by x2 in height and width. That’s why, for the original input size, the output of the last layer (before the pooling layer) is 7x7 (224 / 2^5 = 7).
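To make the five halvings concrete, here is a rough sketch in plain Python that traces the spatial size through the stride-2 steps. The kernel/stride/padding values in `STAGES` are an assumption based on the usual torchvision-style ResNet layout (7x7 stem conv, 3x3 max pool, then a stride-2 3x3 conv at the start of layer2, layer3 and layer4); the helper names `conv_out` and `trace_sizes` are made up for this illustration.

```python
def conv_out(n, kernel, stride, padding):
    """Output length of a conv/pool along one dimension (floor division)."""
    return (n + 2 * padding - kernel) // stride + 1


# Assumed stride-2 steps of a torchvision-style ResNet:
# (name, kernel, stride, padding)
STAGES = [
    ("conv1", 7, 2, 3),
    ("maxpool", 3, 2, 1),
    ("layer2", 3, 2, 1),
    ("layer3", 3, 2, 1),
    ("layer4", 3, 2, 1),
]


def trace_sizes(n):
    """Return the input size followed by the size after each stride-2 step."""
    sizes = [n]
    for _name, kernel, stride, padding in STAGES:
        n = conv_out(n, kernel, stride, padding)
        sizes.append(n)
    return sizes


if __name__ == "__main__":
    for size in (224, 256, 320):
        print(size, "->", trace_sizes(size))
```

With these assumed settings, 224 ends at 7 and 320 ends at 10, so any multiple of 32 divides down cleanly at every step.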

If you choose 255 as your input size, the downsampling layer, which is actually a Conv2d with a stride of 2, will skip the last column and row of the feature map, which causes a loss of information.

From the Conv2d docs:

Depending of the size of your kernel, several (of the last) columns of the input might be lost, because it is a valid cross-correlation, and not a full cross-correlation. It is up to the user to add proper padding.
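As a small illustration of the quoted note, the sketch below counts how many trailing columns of the (padded) input a strided window never reaches. It only uses the standard output-size formula, floor((n + 2p - k) / s) + 1; the function name `unused_trailing` is hypothetical and not part of PyTorch.

```python
def unused_trailing(n, kernel, stride, padding=0):
    """Count trailing columns of the padded input that no window covers."""
    padded = n + 2 * padding
    out = (padded - kernel) // stride + 1   # number of window positions
    covered = (out - 1) * stride + kernel   # last column index reached, plus one
    return padded - covered


if __name__ == "__main__":
    # No padding, 3-wide kernel, stride 2
    for n in (8, 9, 10):
        print(n, "->", unused_trailing(n, kernel=3, stride=2))
```

With no padding, a 3-wide kernel and stride 2, an input of 9 columns is covered completely, while inputs of 8 or 10 columns each leave one trailing column unread. Whether the skipped column holds real data or only padding depends on the padding you pass.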