I am looking to implement a model using a ‘pseudo-3D approach’, similar to this:
https://www.groundai.com/project/evaluation-of-multi-slice-inputs-to-convolutional-neural-networks-for-medical-image-segmentation/1
Specifically, I am working with medical data and want to add adjacent imaging slices (above and below the center slice) as contextual information in order to improve 2D segmentations on the center slice, i.e. a pseudo-3D approach (as the paper calls it). The paper in question adds these neighboring slices as additional channels in the input tensor. My question is: how do I organize this data in the input tensor such that segmentations are only produced for the center slice? This appears to be quite different from a normal multi-channel input image, where the channels are just different color channels of the same image.
My thought was that if my input is CxDxHxW, the first channel would be my center slice (the image I want a predicted segmentation for), and any additional channels would be the adjacent slices providing contextual data. Does this sound correct?
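To make the idea concrete, here is a minimal NumPy sketch of how such an input could be assembled (the helper name `make_pseudo3d_input` and the toy volume are my own for illustration, not from the paper). It stacks the center slice and its neighbors along the channel axis, so a 2D network sees a CxHxW tensor while the single-channel target mask still corresponds only to the center slice:

```python
import numpy as np

def make_pseudo3d_input(volume, center_idx, n_context=1):
    """Stack the center slice and its neighbors along the channel axis.

    volume: (D, H, W) array of slices; center_idx: slice to segment.
    Returns a (C, H, W) array with C = 2*n_context + 1 channels,
    with the center slice in the middle channel.
    (Hypothetical helper, written for illustration only.)
    """
    idxs = np.clip(
        np.arange(center_idx - n_context, center_idx + n_context + 1),
        0, volume.shape[0] - 1,  # repeat edge slices at volume boundaries
    )
    return volume[idxs]  # shape: (2*n_context + 1, H, W)

volume = np.random.rand(10, 64, 64)            # toy volume: 10 slices of 64x64
x = make_pseudo3d_input(volume, center_idx=5)  # channels are slices 4, 5, 6
print(x.shape)  # (3, 64, 64)
```

Note that this sketch places the center slice in the middle channel, keeping the stack in anatomical order; whether the center goes first or in the middle is a design choice, as long as it is applied consistently at training and inference time.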