Non-empirical way to find the right resize for images

I am trying to work out what size I should resize my images to, without training the network.
My images range from (500,500) to (4k,4k).
Right now I start at (448,448) and plan to keep increasing the size by (224,224) until I observe satisfactory convergence.
But I can't find any principled way to arrive at a preferred size without training the network first (which I don't want to do because I have limited compute).
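
For reference, here is a minimal sketch of the size schedule I have in mind. The function name `candidate_sizes` is just illustrative, and I am assuming "4k" means 4096 pixels per side:

```python
def candidate_sizes(start=448, step=224, max_side=4096):
    """Square sizes from `start` upward in `step` increments, capped at `max_side`.

    Assumption: max_side=4096 stands in for the "4k" upper bound of the dataset.
    """
    return [(s, s) for s in range(start, max_side + 1, step)]


sizes = candidate_sizes()
print(sizes[:3])   # first few sizes I would try
print(len(sizes))  # how many training runs a full sweep would cost
```

Each entry would mean one training run, which is exactly the cost I want to avoid.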

Hi, I came across an approach called bag of visual words. Try using it to build a vocabulary from the images, then use the resulting reduced representations as nodes and run a graph convolutional network on them.
That way you don't need to work on the entire image with so many pixels.
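
A rough sketch of the vocabulary step, in case it helps. This is only an illustration under my own assumptions: it uses raw grayscale patches as descriptors (real pipelines usually use local features such as SIFT), a tiny hand-written k-means, and random arrays in place of your images. The function names are made up for the example, and the graph network part is not shown.

```python
import numpy as np


def extract_patches(image, patch=16, stride=16):
    """Flatten patch x patch blocks of a 2-D grayscale image into row vectors."""
    h, w = image.shape
    rows = []
    for y in range(0, h - patch + 1, stride):
        for x in range(0, w - patch + 1, stride):
            rows.append(image[y:y + patch, x:x + patch].ravel())
    return np.asarray(rows, dtype=np.float64)


def kmeans(data, k, iters=10, seed=0):
    """Very small k-means; returns the k cluster centers (the visual words)."""
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), size=k, replace=False)].copy()
    for _ in range(iters):
        dist = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = data[labels == j]
            if len(members):  # keep the old center if a cluster goes empty
                centers[j] = members.mean(axis=0)
    return centers


def bovw_histogram(image, centers, patch=16, stride=16):
    """Normalized count of the nearest visual word for every patch in the image."""
    desc = extract_patches(image, patch, stride)
    dist = ((desc[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    words = dist.argmin(axis=1)
    hist = np.bincount(words, minlength=len(centers)).astype(np.float64)
    return hist / hist.sum()


# Placeholder data: two random "images" of different sizes.
rng = np.random.default_rng(0)
images = [rng.random((64, 64)), rng.random((128, 96))]

all_desc = np.vstack([extract_patches(img) for img in images])
vocab = kmeans(all_desc, k=8)
hists = [bovw_histogram(img, vocab) for img in images]
print([h.shape for h in hists])
```

The point is that every image ends up as a vector of length `k`, whatever its original resolution, so the resize question matters less for this representation.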
