Creating a CNN with bi-temporal image data

I’m looking into ways to create a convolutional net that can classify whether deforestation has occurred in an image based on a prior image of that same area. My dataset is a large set of pairs of image tiles of a general area in the Amazon at a time t1 and later time t2. Each image pair shows the same location on a map. Anyone have advice on how I could best approach this?

Keeping things simple, I define deforestation per pixel: if a pixel becomes a lighter shade of green, or changes from a shade of green to a different color, in the second image compared to the first, that pixel gets assigned -1. It gets +1 if the opposite is true, and 0 if no change has occurred. Then I'd aggregate this over all the pixels in the 256x256 pixel image to get a classification for the image as a whole.
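A rough sketch of that per-pixel rule in NumPy (the "greenness" test and the 5% aggregation threshold are hypothetical placeholders, not standard definitions — you'd tune both on real imagery):

```python
import numpy as np

def change_map(img_t1, img_t2, green_margin=20):
    """Per-pixel change labels for two HxWx3 uint8 tiles of the same area.

    A pixel counts as 'green' if its G channel exceeds both R and B by
    `green_margin` (a made-up threshold -- adjust for your data).
    Returns an HxW array of -1 (lost green), +1 (gained green), 0 (no change).
    """
    def is_green(img):
        img = img.astype(np.int16)  # avoid uint8 underflow when subtracting
        r, g, b = img[..., 0], img[..., 1], img[..., 2]
        return (g - r > green_margin) & (g - b > green_margin)

    green1, green2 = is_green(img_t1), is_green(img_t2)
    labels = np.zeros(img_t1.shape[:2], dtype=np.int8)
    labels[green1 & ~green2] = -1  # green at t1, not at t2: deforested
    labels[~green1 & green2] = +1  # the opposite: regrowth
    return labels

def tile_label(labels, threshold=0.05):
    """Aggregate per-pixel labels: flag the tile if >5% of pixels lost green."""
    return "deforestation" if (labels == -1).mean() > threshold else "no change"
```

Note that a hand-coded rule like this is brittle under lighting and seasonal variation, which is part of why the learned approaches suggested below tend to work better.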

Conceptually, I feel like this shouldn’t be too hard to implement but I’m pretty new to all this so any help would be awesome :slight_smile:

Quick hints:
Classification means one label for the whole input. If you want a per-pixel output instead, that's a segmentation problem.

If you have a labeled dataset you can do segmentation easily with a U-Net.
If you have images classified as def and no def, you can use metric learning (google it) to train a model that brings no def / no def pairs together and pushes no def / def pairs apart.
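A minimal sketch of that metric-learning idea in PyTorch, using a shared encoder and a contrastive loss (the tiny encoder and the margin value here are placeholders; in practice you'd likely start from a pretrained backbone):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Tiny CNN mapping a 3x256x256 tile to a unit-norm embedding vector."""
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, dim),
        )

    def forward(self, x):
        return F.normalize(self.net(x), dim=1)

def contrastive_loss(z1, z2, same, margin=1.0):
    """Pull same-class pairs together, push different-class pairs apart.

    same: 1.0 when both tiles have the same label (e.g. no def / no def),
    0.0 otherwise. margin is a hyperparameter to tune.
    """
    d = F.pairwise_distance(z1, z2)
    return (same * d.pow(2) + (1 - same) * F.relu(margin - d).pow(2)).mean()
```

Both the t1 and t2 tiles go through the same (weight-shared) encoder; at inference time, a large embedding distance between the pair flags a change.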

Lastly, you can probably pose it as a classification problem, passing both images as input.
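That last option can be as simple as stacking the two RGB tiles channel-wise into a 6-channel input. A rough sketch, assuming binary output (the architecture is a placeholder, not a recommendation):

```python
import torch
import torch.nn as nn

class BiTemporalClassifier(nn.Module):
    """Binary classifier over a t1/t2 pair stacked channel-wise (6x256x256)."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(6, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(64, 2)  # logits: [no deforestation, deforestation]

    def forward(self, t1, t2):
        x = torch.cat([t1, t2], dim=1)  # (N, 6, H, W)
        return self.head(self.features(x))
```

Trained with `nn.CrossEntropyLoss` against your per-tile labels, this directly answers the "did deforestation occur between t1 and t2?" question without needing per-pixel annotations.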