I have been running into a pretty challenging issue for the past week. I have to get my images into a single numpy array. However, my model takes in images of different sizes, so their dimensions do not match. I would like to get everything into the numpy array without changing the predictions. Right now, I can resize all images to the same size with a transformation and add them to the numpy array. This works, but it changes the predictions for some of the images. I have also tried padding an image with whitespace to expand it, but this also changed the predictions. Any ideas for how I can go about this?
I have also tried padding the tensors so they are all the same size. This does not give an error, but everything gets padded to the largest image’s dimensions. I found that past a certain amount of padding, the prediction changes completely: there is so much white space that every image gets the same prediction.
Depending on the type of model, you may not need to resize the images at all if you are OK with running it with batch size 1. For example, many architectures use adaptive average pooling before the fully connected layers, which means that inputs of different resolutions are usable as long as they are not batched together.
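To illustrate the idea, here is a rough NumPy-only sketch, not your actual model. It assumes channel-first images of shape (C, H, W) and uses a hypothetical `global_avg_pool` function as a stand-in for the adaptive average pooling layer (with output size 1). The images are kept in an object-dtype array so each one keeps its original size, and they are processed one at a time, which is the batch size 1 case.

```python
import numpy as np

def global_avg_pool(feature_map):
    """Average over the spatial axes of a (C, H, W) array, giving shape (C,).

    Hypothetical stand-in for adaptive average pooling with output size 1.
    """
    return feature_map.mean(axis=(1, 2))

# Made-up image sizes, all with 3 channels but different heights and widths.
shapes = [(3, 32, 48), (3, 64, 64), (3, 17, 100), (3, 8, 8)]

rng = np.random.default_rng(0)

# An object-dtype array can hold arrays of different shapes without resizing.
images = np.empty(len(shapes), dtype=object)
for i, shape in enumerate(shapes):
    images[i] = rng.random(shape)

# Process one image at a time (batch size 1). The pooled output has the same
# length for every image, so the results can be stacked into a regular array.
features = np.stack([global_avg_pool(img) for img in images])

print(features.shape)  # (4, 3)
```

The point is that the pooled vector has length C no matter what H and W are, so whatever comes after the pooling (the fully connected layers) always sees the same input size. Whether this applies to your model depends on whether it actually pools like this before its fully connected layers.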