If I am using ResNet models but removing the fc layers, I don’t need to resize my images to 244*244, right? If I do resize them, is there a chance of losing accuracy?
First things first: the size is 224, not 244. I think you should also remove the avgpool layer, or just use nn.AdaptiveAvgPool2d in its place. The input size you choose is recommended to be n * 2^5 (n >= 1), i.e. a multiple of 32.
Can you explain why it is recommended to be such?
There are 5 downsampling “layers” in ResNet in total, and each one downsamples the feature map by a factor of 2. That’s why, for the original input size, the output of the last layer (before the pooling layer) is 7x7 (224 / 2^5 = 7).
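To make the arithmetic above concrete, here is a minimal sketch in plain Python (no PyTorch needed) that traces the spatial size through the five stride-2 stages. The kernel/stride/padding values in `STAGES` are my assumption of how torchvision lays out its ResNet; `conv_out_size` and `trace_sizes` are hypothetical helper names, not part of any library.

```python
def conv_out_size(n, kernel, stride, padding):
    """Output length along one spatial dim of a conv/pool layer (dilation 1)."""
    return (n + 2 * padding - kernel) // stride + 1


# (name, kernel, stride, padding) of the five stride-2 stages.
# Assumed from torchvision's ResNet layout; check your own model.
STAGES = [
    ("conv1", 7, 2, 3),
    ("maxpool", 3, 2, 1),
    ("layer2", 3, 2, 1),
    ("layer3", 3, 2, 1),
    ("layer4", 3, 2, 1),
]


def trace_sizes(n):
    """Return the spatial size at the input and after each downsampling stage."""
    sizes = [n]
    for _name, kernel, stride, padding in STAGES:
        n = conv_out_size(n, kernel, stride, padding)
        sizes.append(n)
    return sizes


print(trace_sizes(224))  # [224, 112, 56, 28, 14, 7]
print(trace_sizes(256))  # [256, 128, 64, 32, 16, 8]
```

With an input that is a multiple of 32, every stage halves the size exactly, so the final feature map is simply the input size divided by 32.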
If you choose 255 as your input size, the downsampling layer, which is actually a Conv2d with stride 2, can end up skipping the last column and row of the feature map, which causes a loss of information.
From the Conv2d docs:
Depending of the size of your kernel, several (of the last) columns of the input might be lost, because it is a valid cross-correlation, and not a full cross-correlation. It is up to the user to add proper padding.
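The doc note above can be checked by counting how many trailing input columns an unpadded, strided convolution never reads. This is only a rough sketch for one spatial dimension, assuming no padding and dilation 1; `unused_trailing_columns` is a hypothetical helper name.

```python
def unused_trailing_columns(n, kernel, stride):
    """Count trailing input columns never covered by any kernel window.

    Assumes a valid (unpadded) convolution with dilation 1 and n >= kernel.
    """
    out = (n - kernel) // stride + 1      # number of window positions
    covered = (out - 1) * stride + kernel  # columns 0 .. covered-1 are read
    return n - covered


print(unused_trailing_columns(7, kernel=3, stride=2))   # 0: windows fit exactly
print(unused_trailing_columns(8, kernel=3, stride=2))   # 1: last column is dropped
print(unused_trailing_columns(12, kernel=4, stride=3))  # 2: last two columns dropped
```

Whether real data or only padding falls into the dropped region depends on the layer's padding, which is why the docs leave the choice of padding to the user.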