Spectral and inter-channel networks

Hi everybody!
I'm starting this topic in the hope of having an enriching conversation about possible architectures, especially for remote sensing segmentation applications at medium resolution. Here's the framing:

In many remote sensing settings (say, mapping forests at 20-30m resolution), the spatial context appears to be not so relevant, and the spectral information (the pixel values across bands) becomes much more important. So much so that it’s very common in remote sensing to use vegetation and other derived indices (such as NDVI) as input features. Such features are usually extracted beforehand.
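For anyone less familiar with these indices, here is a minimal sketch of NDVI, which is defined as (NIR - Red) / (NIR + Red). It is plain Python over nested lists to keep it dependency-free. The function names and the sample reflectance values are made up for illustration, and returning 0.0 when both bands are zero is just a convention I chose here.

```python
def ndvi(nir, red):
    """NDVI for a single pixel: (NIR - Red) / (NIR + Red)."""
    denom = nir + red
    if denom == 0:
        # Assumption: treat pixels with no signal in either band as 0.0
        return 0.0
    return (nir - red) / denom


def ndvi_image(nir_band, red_band):
    """Apply NDVI pixel by pixel to two 2D bands given as lists of rows."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Hypothetical 2x2 reflectance patches (values in [0, 1])
nir_patch = [[0.5, 0.4], [0.3, 0.0]]
red_patch = [[0.1, 0.4], [0.6, 0.0]]
print(ndvi_image(nir_patch, red_patch))
```

Note that this is a fixed, hand-designed combination of two bands computed independently for each pixel, which is exactly the kind of inter-band relationship one might hope a network could learn by itself.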

It is also common at these resolutions to use multispectral instruments with ~12 bands, and even combine different sensors.

Intuitively, then, a network that convolves across the channel dimension could exploit some of these inter-band relationships, but 3D convolutions are usually expensive.
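To put rough numbers on "expensive", here is a back-of-the-envelope sketch that counts multiply-accumulates (MACs) per spatial location for a single layer. The layer sizes (12 bands, 64 filters, 3x3 spatial kernels, a spectral kernel of size 3, stride 1 with "same" padding so the spectral axis keeps its length) are assumptions for illustration, not taken from any particular model, and the helper names are hypothetical.

```python
def conv2d_macs(c_in, c_out, k):
    """MACs per output pixel for a 2D conv that mixes all c_in channels."""
    return c_in * c_out * k * k


def conv3d_macs(depth, c_in, c_out, k_spectral, k_spatial):
    """MACs per spatial location for a 3D conv sliding along the spectral axis.

    Assumes stride 1 and 'same' padding, so the output keeps `depth`
    positions along the spectral axis, each with c_out features.
    """
    return depth * c_in * c_out * k_spectral * k_spatial * k_spatial


bands, filters = 12, 64

# First layer: bands are channels (2D) or a depth axis with 1 feature (3D)
print("1x1 conv (pointwise):", conv2d_macs(bands, filters, 1))          # 768
print("3x3 conv (2D):       ", conv2d_macs(bands, filters, 3))          # 6912
print("3x3x3 conv (3D):     ", conv3d_macs(bands, 1, filters, 3, 3))    # 20736

# Second layer: the 3D network still carries the full spectral axis
print("2nd layer, 2D:       ", conv2d_macs(filters, filters, 3))        # 36864
print("2nd layer, 3D:       ", conv3d_macs(bands, filters, filters, 3, 3))  # 1327104
```

Under these assumptions the first 3D layer is only 3x the cost of the 2D one, but because the spectral axis is preserved, the second layer is 36x more expensive. A 1x1 convolution, which mixes bands per pixel with no spatial context at all, is 9x cheaper than the 3x3 2D layer.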

So here comes the point: what are good ways of exploiting this spectral information more effectively and more efficiently?

I’m currently doing a literature review on this and will be happy to post my findings as I go if there’s interest, but I would really love to hear people’s thoughts on this.