How to handle different input image sizes?

Say I have an image model: how can I handle different image sizes at inference time?

Should I generate multiple onnx models for different sizes?

Do I do resizing or padding?


The typical approach is to simply crop and resize to the model's trained input size, so that the scale of the objects (in terms of pixels) stays close to what was seen at training time. You can generate different models for different sizes, but you still need to account for this scale issue: a model trained at 224x224 will start losing accuracy around 400x400 input resolution without finetuning, because of the scale mismatch.
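As a minimal sketch of the resize-then-center-crop step, assuming a 224x224 model: the `preprocess` helper name is mine, and it uses nearest-neighbor resampling purely for illustration (in practice you would use bilinear resampling via Pillow, OpenCV, or torchvision).

```python
import numpy as np

def preprocess(img: np.ndarray, target: int = 224) -> np.ndarray:
    """Resize the shorter side to `target` (preserving aspect ratio),
    then center-crop to target x target, so object scale roughly matches
    what the model saw at training time."""
    h, w = img.shape[:2]
    scale = target / min(h, w)
    new_h, new_w = round(h * scale), round(w * scale)
    # Nearest-neighbor resize: map each output pixel back to a source pixel.
    rows = (np.arange(new_h) / scale).astype(int).clip(0, h - 1)
    cols = (np.arange(new_w) / scale).astype(int).clip(0, w - 1)
    img = img[rows][:, cols]
    # Center-crop the longer side down to `target`.
    top = (new_h - target) // 2
    left = (new_w - target) // 2
    return img[top:top + target, left:left + target]

# Any input size ends up as a 224x224 crop:
landscape = np.zeros((480, 640, 3), dtype=np.uint8)
portrait = np.zeros((300, 100, 3), dtype=np.uint8)
print(preprocess(landscape).shape)  # (224, 224, 3)
print(preprocess(portrait).shape)   # (224, 224, 3)
```

The same preprocessing can sit in front of a single exported ONNX model, so you do not need one model per input size.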

[1906.06423] Fixing the train-test resolution discrepancy is a good paper with more details on how to handle resizing.