CNN input image size formula

Hi, I’ve read and searched and read some more on the forum, but I can’t understand the following:
how do I calculate and set the network’s input size, and what is its relation to image size?

I have an AlexNet clone (single channel, 224 x 224) which I now want to use with a single-channel 48 x 48 greyscale image:

import torch.nn as nn

class alexnet_custom(nn.Module):

    def __init__(self, num_classes=2):
        super(alexnet_custom, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=11, stride=4, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(64, 192, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(192, 384, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )
        self.classifier = nn.Sequential(
            nn.Dropout(),
            nn.Linear(256 * 6 * 6, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Linear(4096, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        # flatten to (batch, 256 * 6 * 6); this hard-coded size is what ties
        # the classifier to a 224 x 224 input
        x = x.view(x.size(0), 256 * 6 * 6)
        return self.classifier(x)

Is there some clear and concise way to understand the relation between the tensor input and the image size, and how that is related to channels, height, width (and batch size)?


tensor input and the image size
single channel 224 x 224

224 x 224 is the input size of the image.
So if you want your image to be 48 x 48, your input tensor should be (batchsize, channels, 48, 48).
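For example, something like this (just a dummy batch; the batch size of 8 is an assumption):

import torch

# (batchsize, channels, height, width): 8 single-channel 48 x 48 images
x = torch.randn(8, 1, 48, 48)
print(x.shape)  # torch.Size([8, 1, 48, 48])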

@dhecloud how does that translate to the CNN input in

self.features = nn.Sequential(...

That is what is not clear, no matter how much googling I do :frowning:

The layers are specific and unique to each architecture. You can google their papers if you want to know more.

Your input (batchsize, 1, 224, 224) is fed through the layers in sequence, i.e. nn.Conv2d(1, 64, kernel_size=11, stride=4, padding=2), then nn.ReLU(inplace=True), and so on.

If you are modifying the original architecture, then you also have to make sure the resulting matrix operations are legal.
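One quick way to check that, just as a sketch (using the alexnet_custom class from the first post and a dummy input), is to feed a tensor through the feature layers one at a time and print the shape after each:

import torch

model = alexnet_custom()
x = torch.randn(1, 1, 224, 224)  # dummy single-channel 224 x 224 image
for layer in model.features:
    x = layer(x)
    print(type(layer).__name__, tuple(x.shape))
# the last printed shape is what gets flattened into the first Linear layer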

So from what I understood, the Conv layers do not specify the image input size?
How do I calculate the matching dimensions of nn.Conv2d, especially the first one, knowing I have a 1-channel 48 x 48 image? Or is that not relevant as a question?

So from what I understood, the Conv layers do not specify the image input size?

Yes, they do not specify the input size. The input channels of the first conv layer have to match the number of channels in the image, though. But the layers do affect the output tensor shape.

How do I calculate the matching dimensions of nn.Conv2d, especially the first one, knowing I have a 1-channel 48 x 48 image? Or is that not relevant as a question?

I’m not sure what you mean. Do you mean how you should choose the parameters of the filters? There’s a bit of math required for this; you should read up on cs231n. If you are asking how to calculate the output tensor shape, the formula for that is in the Conv2d documentation.
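For reference, here is a small helper (a sketch of the formula from the Conv2d docs; MaxPool2d uses the same one, and the function name conv_out_size is just made up here) that computes the output height or width of a single layer:

def conv_out_size(size, kernel_size, stride=1, padding=0, dilation=1):
    # output size formula from the Conv2d documentation
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

# first AlexNet conv layer with a 224 x 224 input:
print(conv_out_size(224, kernel_size=11, stride=4, padding=2))  # 55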


Ok, I got it. I was for some reason under the impression that the first Conv layer would have a relation to the input size.
You can disregard my question, many thanks for the help.

If you change the input size, the size of the last convolutional layer's output, which is reshaped (flattened) to connect to the first fully connected layer, changes. In that case the input size of the first fully connected layer should change as well. For example, for AlexNet taking in 224 x 224 you get 6 x 6 in the last conv layer; for 48 x 48 you might expect this to be 2 x 2 (guessing; the best way would be to pass the image through the features part and see the size of the output).
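For example, a quick sanity check along those lines (a sketch, using the alexnet_custom class from the first post; swap in your real input size):

import torch

model = alexnet_custom()

x = torch.randn(1, 1, 224, 224)  # dummy input at the original size
out = model.features(x)
print(out.shape)  # torch.Size([1, 256, 6, 6]) -> 256 * 6 * 6 in the first Linear layer

# repeat with torch.randn(1, 1, 48, 48) to get the number you need for the
# first nn.Linear; if a pooling layer complains that the output size is too
# small, the input is too small for this particular stack of layers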


A network whose first layer is Conv2d will take an input of shape (batchsize, n_channels, height, width).

Since convolutional layers in PyTorch are dynamic by design, there is no straightforward way to query the intended/expected height and width; in fact, any image size may be acceptable to a module composed entirely of convolutions (subject to remaining a valid size after unpadded convolutions, poolings, etc.).

If the network subsequently contains, e.g., a Linear layer with a fixed input size parameter, only images whose final feature maps flatten to exactly that many elements are acceptable input to the network; roughly, a size like (height/n, n*width) should still work (subject to the same conditions as above, where height and width are the intended dimensions of the input).
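As an illustration (again just a sketch, using the alexnet_custom class from the first post): the convolutional part accepts several input sizes, while the full network is constrained by the hard-coded nn.Linear(256 * 6 * 6, 4096):

import torch

model = alexnet_custom()

# the purely convolutional part accepts any size that stays valid through
# every stride/pooling stage, and simply produces differently sized outputs
for s in (224, 112, 96):
    print(s, tuple(model.features(torch.randn(1, 1, s, s)).shape))

# the full model, however, only works for input sizes whose feature maps
# flatten to exactly 256 * 6 * 6 elements; anything else hits a shape
# mismatch at the first Linear layer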
