Dear Community,
I came across this phenomenon earlier this week:
import torch
import torchvision

# ImageNet-pretrained DenseNet-121, switched to inference mode
model = torchvision.models.densenet121(pretrained=True)
model.eval()

# Dummy inputs in NCHW layout with increasing spatial sizes
im224s = torch.zeros(size=(1, 3, 224, 224))
im360s = torch.zeros(size=(1, 3, 360, 360))
im640 = torch.zeros(size=(1, 3, 640, 360))
im1920 = torch.zeros(size=(1, 3, 1920, 1080))
print("--- TEST: testing 224x224 ---")
x = model(im224s)
print("--- TEST: OK ---")
print("--- TEST: testing 360x360 ---")
x = model(im360s)
print("--- TEST: OK ---")
print("--- TEST: testing 640x360 ---")
x = model(im640)
print("--- TEST: OK ---")
print("--- TEST: testing 1920x1080 ---")
x = model(im1920)
print("--- TEST: OK ---")
Returns:
--- TEST: testing 224x224 ---
--- TEST: OK ---
--- TEST: testing 360x360 ---
--- TEST: OK ---
--- TEST: testing 640x360 ---
--- TEST: OK ---
--- TEST: testing 1920x1080 ---
--- TEST: OK ---
When I feed images into my vanilla pretrained densenet121 that are bigger than the expected input size (224x224), the model still returns an output.
I understand that the input size doesn't matter much for the convolutional kernels themselves, since feature maps can still be built, but the size of each feature map does depend on the input image.
So if this works, a bigger-than-expected image must lead to some form of information throw-away, at the latest right before the final fully connected layer, which has a fixed-size input.
So how do PyTorch models handle over-sized images?
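My guess is that the throw-away happens via a global/adaptive average pooling step before the classifier, which collapses each channel's H×W feature map to a single value regardless of H and W. A minimal pure-Python sketch of that idea (toy nested lists standing in for feature maps, not the actual torchvision code):

```python
def global_avg_pool(feature_maps):
    """Collapse each channel (a list of rows of floats) to one scalar,
    so the output length depends only on the channel count, not on H x W."""
    pooled = []
    for channel in feature_maps:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled

# A 2-channel "feature map" at 2x2 spatial size ...
small = [[[1.0, 2.0], [3.0, 4.0]],
         [[0.0, 0.0], [0.0, 4.0]]]
# ... and one at 2x4 (bigger spatial size, same channel count)
big = [[[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]],
       [[0.0] * 4, [0.0] * 4]]

print(global_avg_pool(small))  # [2.5, 1.0] -- 2 values
print(global_avg_pool(big))    # [2.5, 0.0] -- still 2 values
```

Either way, the classifier would then always see a fixed-length vector (one value per channel), no matter how large the input image was.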
Cheers
tl;dr: the output size of a feature map is calculated like this:
Size = (Size_pre - Filtersize + 2*Padding)/Stride + 1, i.e. the size of a feature map depends on the size of the previous feature map and the filter size. How does PyTorch handle over-sized images? It does not throw an error.
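As a sanity check of that formula, here is a small sketch plugging in the parameters of DenseNet-121's first conv layer (7x7 kernel, stride 2, padding 3, assuming the torchvision implementation):

```python
def conv_out_size(size_pre, filter_size, padding, stride):
    # Size = (Size_pre - Filtersize + 2*Padding) / Stride + 1
    return (size_pre - filter_size + 2 * padding) // stride + 1

# DenseNet-121 stem conv: 7x7 kernel, stride 2, padding 3 (assumed)
print(conv_out_size(224, 7, 3, 2))   # 112
print(conv_out_size(1920, 7, 3, 2))  # 960
```

So every conv/pool layer scales with the input, and the spatial size only stops mattering once it is pooled away before the classifier.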