What convolution to use (conv1d, conv2d or conv3d)

The image is represented with a stack of layers which are equidistant in depth (depth layers), and instead of feeding the whole image (every pixel) I want to feed R random pixels (from 0 to HxW). So I have an array of shape (N, D, C, R, 1) where N is the batch, D is the number of layers, C is the channels, and R is the number of random pixels. I want to combine the layers into a final image (N, 1, C, R, 1). What conv is best to use for this network?

Hi Pam!

Sampling random pixels seems like a bad idea because doing so
throws away the important two-dimensional structure that your image
contains.
If for memory or computation-time reasons you need to reduce the
amount of data in your images, I would recommend down-sampling
in a way that preserves that two-dimensional structure. This might
be as easy as using pytorch’s AvgPool2d or MaxPool2d (across
the HxW dimensions).
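As a minimal sketch of this kind of down-sampling (the tensor sizes here are hypothetical, and I fold the batch and depth dimensions together so the 4d pooling layer can be applied):

```python
import torch

# hypothetical sizes: batch, depth layers, channels, height, width
N, D, C, H, W = 2, 5, 3, 64, 64
x = torch.randn(N, D, C, H, W)

# AvgPool2d needs a (batch, channels, H, W) tensor, so fold D into the batch
pool = torch.nn.AvgPool2d(kernel_size=4)   # down-sample HxW by a factor of 4
y = pool(x.view(N * D, C, H, W)).view(N, D, C, H // 4, W // 4)
print(y.shape)   # torch.Size([2, 5, 3, 16, 16])
```

The spatial neighborhoods survive the pooling, so convolutions downstream still have real structure to work with.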

If you do that, I would then recommend applying Conv3d across the
depth and the down-sampled H and W dimensions.
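Something like the following, again with made-up sizes. Conv3d expects its input as (batch, channels, depth, H, W), so the channel and depth dimensions have to be swapped first; using a kernel that spans the full depth (with no depth padding) collapses the depth layers into one, which is what the original question asked for:

```python
import torch

# hypothetical sizes after down-sampling HxW
N, D, C, H, W = 2, 5, 3, 16, 16
x = torch.randn(N, D, C, H, W)

# kernel spans all D depth layers, 3x3 spatially (padded to preserve HxW)
conv = torch.nn.Conv3d(in_channels=C, out_channels=C,
                       kernel_size=(D, 3, 3), padding=(0, 1, 1))
y = conv(x.permute(0, 2, 1, 3, 4))   # input (N, C, D, H, W)
print(y.shape)   # torch.Size([2, 3, 1, 16, 16]) -- depth collapsed to 1
```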

To answer your original question, if you randomly sample pixels
independently from each depth layer, I don’t think any convolution
would make sense – there would be no “nearest-neighbor” spatial
structure left for the convolution to be applied to.

If instead you sample pixels from each depth layer using the same
randomly-chosen locations in HxW, you could conceivably use
Conv1d, applied across the depth dimension (because you have
preserved the nearest-neighbor structure when you move from one
depth layer to the next while keeping the location in your R dimension
fixed).
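A sketch of that Conv1d version, keeping your (N, D, C, R, 1) layout (the sizes are hypothetical): Conv1d expects (batch, channels, length), so I fold the R sampled pixels into the batch, treat D as the length, and use a kernel that spans all D layers to produce the (N, 1, C, R, 1) result you described:

```python
import torch

# hypothetical sizes: batch, depth layers, channels, sampled pixels
N, D, C, R = 2, 5, 3, 100
x = torch.randn(N, D, C, R, 1)

# (N, D, C, R, 1) -> (N, R, C, D) -> (N*R, C, D) for Conv1d
xr = x.squeeze(-1).permute(0, 3, 2, 1).reshape(N * R, C, D)

# kernel spans all D depth layers, collapsing them to length 1
conv = torch.nn.Conv1d(in_channels=C, out_channels=C, kernel_size=D)
y = conv(xr)                                           # (N*R, C, 1)
y = y.view(N, R, C, 1).permute(0, 3, 2, 1).unsqueeze(-1)
print(y.shape)   # torch.Size([2, 1, 3, 100, 1])
```

Note that this only makes sense if the R locations are the same for every depth layer, as described above.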


K. Frank