Variable number of input images into CNN

Johnny_Cash · March 2, 2023, 10:29am

Is it possible to input a variable number of images into a CNN? I would like to pool information from a stack of images. I suppose this is possible for a fixed number of images: for 5 images, for example, input a tensor with depth 15 (5 images × 3 RGB channels) and then maybe use Conv3D and MaxPool3D to reduce the dimension. Is this somehow possible for a non-static number of images, or does the number of input images have to be fixed?

I suppose this should work somehow, because you could, for example, write a fully convolutional network that accepts images of arbitrary size, which basically removes the limitation along the two spatial axes.
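To make the fully convolutional idea concrete, here is a minimal sketch, assuming PyTorch. The class name `TinyFCN`, the layer widths and the number of classes are made up for illustration. The only point is that there is no fixed-size linear layer, so the spatial size of the input is free to vary.

```python
import torch
import torch.nn as nn

class TinyFCN(nn.Module):
    """Hypothetical fully convolutional classifier (illustration only)."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        # Global average pooling reduces any H x W feature map to 1 x 1.
        self.pool = nn.AdaptiveAvgPool2d(1)
        # A 1x1 convolution stands in for the usual linear classifier.
        self.head = nn.Conv2d(32, num_classes, kernel_size=1)

    def forward(self, x):
        x = self.features(x)            # (batch, 32, H, W)
        x = self.pool(x)                # (batch, 32, 1, 1)
        return self.head(x).flatten(1)  # (batch, num_classes)

fcn = TinyFCN()
small = fcn(torch.randn(1, 3, 32, 32))
large = fcn(torch.randn(1, 3, 48, 64))
print(small.shape, large.shape)
```

Both calls return a tensor with one row of `num_classes` scores, even though the two inputs have different heights and widths.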

J_Johnson · March 2, 2023, 10:48am

You can always alter the batch_size. Otherwise, you can use Conv3d layers and stack your images so they are of shape (batch_size, channels, num_images, dim1, dim2). Then just alter the kernel_size, stride, dilation, etc. to keep the size of the num_images dimension the same. With this method, however, you would not be able to stack sets with different numbers of images into the same batch. Also, the model would learn relationships between the stacked images, so the order of the images would become relevant.
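A rough sketch of that layout, assuming PyTorch. The module name `StackedImageEncoder` and the feature count are hypothetical, and the final max over the num_images axis is an extra step added here for illustration (it is not part of the suggestion above) to show one way of getting an output that does not depend on how many images were stacked.

```python
import torch
import torch.nn as nn

class StackedImageEncoder(nn.Module):
    """Hypothetical module: Conv3d over a stack of images (illustration only)."""

    def __init__(self, in_channels=3, features=8):
        super().__init__()
        # kernel_size=3 with padding=1 and the default stride=1 keeps
        # num_images, dim1 and dim2 unchanged.
        self.conv = nn.Conv3d(in_channels, features, kernel_size=3, padding=1)
        self.act = nn.ReLU()

    def forward(self, x):
        # x: (batch_size, channels, num_images, dim1, dim2)
        x = self.act(self.conv(x))
        # Assumption for this sketch: collapse the num_images axis with a max,
        # so the output shape is independent of the stack size.
        return x.amax(dim=2)  # (batch_size, features, dim1, dim2)

encoder = StackedImageEncoder()
out_5 = encoder(torch.randn(2, 3, 5, 32, 32))  # sets of 5 images
out_9 = encoder(torch.randn(2, 3, 9, 32, 32))  # sets of 9 images
print(out_5.shape, out_9.shape)
```

The same weights handle both stack sizes because the convolution slides along the num_images axis. Within a single batch, though, every set still needs the same number of images, since the batch is one dense tensor.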

Johnny_Cash · March 3, 2023, 10:35am

OK, so within one batch every set would have the same number of images, and the next batch could use a different number, as long as it is again the same for every set within that batch.
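One way to build such batches, as a minimal sketch assuming PyTorch and assuming each set is a tensor of shape (channels, num_images, dim1, dim2). The helper name `batch_by_num_images` is made up for this example.

```python
from collections import defaultdict

import torch

def batch_by_num_images(samples):
    """Group image sets by their number of images and stack each group.

    Hypothetical helper: each sample is assumed to have shape
    (channels, num_images, dim1, dim2). Returns a dict that maps
    num_images to a tensor of shape (batch_size, channels, num_images, dim1, dim2).
    """
    groups = defaultdict(list)
    for sample in samples:
        groups[sample.shape[1]].append(sample)
    # torch.stack only works because every set in a group has the same shape.
    return {n: torch.stack(group) for n, group in groups.items()}

sets = [
    torch.randn(3, 5, 32, 32),
    torch.randn(3, 7, 32, 32),
    torch.randn(3, 5, 32, 32),
]
batches = batch_by_num_images(sets)
for n, batch in batches.items():
    print(n, tuple(batch.shape))
```

Each resulting batch is a dense tensor with a single num_images value, so the batches can be fed to a Conv3d model one after another.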

J_Johnson · March 3, 2023, 12:34pm

You can send any batch_size your memory can handle into a model; other than that, there is no limitation on the batch_size between back-propagation steps. Granted, your learning rate might need an appropriate adjustment to match the batch_size and avoid overfitting to smaller batches.

Johnny_Cash · March 3, 2023, 12:52pm

That’s clear. I meant something different, but never mind, I figured it out. Thanks!