Maybe I am missing something obvious, but if a pre-trained network’s input image size is 300x300 or 500x500, how would the performance of classification/segmentation inference be affected when I feed it an image upsampled from 224px?
While I did observe differences in the outputs, are there experiments one could run to understand how these perturbations in image size affect the model’s inference?
In my practical experience, upsampling typically degrades results more than downsampling does. However, since this is only a factor of roughly 2x, it shouldn’t be a big issue.
To avoid rescaling issues altogether, you could also choose an architecture that is agnostic to the input dimensions, e.g., a fully convolutional network (avoiding FC layers), or add a spatial pyramid pooling layer before the FC layer(s).
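To illustrate why such architectures are input-size agnostic: a global (adaptive) average pool collapses whatever spatial size the conv backbone produces into a fixed-length vector, so the classifier head works for any input resolution. Here is a minimal pure-Python sketch of the pooling step (in PyTorch this is what `nn.AdaptiveAvgPool2d(1)` does); the nested-list "feature maps" are just illustrative stand-ins:

```python
def global_avg_pool(feature_map):
    """Collapse a C x H x W feature map (nested lists) to a length-C vector.

    The output length depends only on the channel count C, not on H or W,
    which is why the layer after it never sees a shape mismatch when the
    input image size changes.
    """
    pooled = []
    for channel in feature_map:
        total = sum(sum(row) for row in channel)
        count = len(channel) * len(channel[0])
        pooled.append(total / count)
    return pooled

# Two feature maps with different spatial sizes but the same channel count:
small = [[[1.0, 2.0], [3.0, 4.0]]]        # 1 channel, 2 x 2
large = [[[1.0] * 4 for _ in range(4)]]   # 1 channel, 4 x 4

print(global_avg_pool(small))  # [2.5]
print(global_avg_pool(large))  # [1.0]
```

Both calls return a length-1 vector despite the different spatial extents; spatial pyramid pooling generalizes this by pooling at several fixed grid sizes and concatenating the results.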