Predicting Digital Surface Models (DSM) with one channel

Hello community,

I recently built a U-Net for predicting Digital Surface Models (DSMs, i.e. heightmaps). My input has a shape of 3x512x512 (the RGB channels of my satellite image), while my ground truth is a DSM of shape 1x512x512. Everything works fine and the losses look good, but I am wondering which metrics I could use to measure something like accuracy or Dice loss.

Of course, this is not really a classic semantic segmentation task: the resulting DSM does not contain 1 for segmented pixels and 0 for the rest, it contains the corresponding height of each pixel. So it seems useless to compute an accuracy that compares the predicted DSM with the target one pixel by pixel, because a float prediction will almost never match the EXACT height, right?
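Since this is a per-pixel regression task, I guess metrics like MAE and RMSE (in meters) would be more natural than classification accuracy. A minimal sketch of what I mean in PyTorch, assuming `pred` and `target` are tensors of shape `(N, 1, 512, 512)` (the function name is just my choice):

```python
import torch

def dsm_regression_metrics(pred: torch.Tensor, target: torch.Tensor) -> dict:
    """Per-pixel regression metrics for predicted DSMs (heights in meters)."""
    err = pred - target
    return {
        "mae": err.abs().mean().item(),           # mean absolute error
        "rmse": err.pow(2).mean().sqrt().item(),  # root mean squared error
    }
```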

My idea was to use something like a 10 percent tolerance for the element-wise comparison, but I don't know if this can be done efficiently. Or maybe this is a known problem?
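To make the idea concrete: the tolerance check can be fully vectorized, so it should be cheap even at 512x512. A sketch of what I have in mind (the name `tolerance_accuracy` and the 10% default are just my choices):

```python
import torch

def tolerance_accuracy(pred: torch.Tensor, target: torch.Tensor,
                       rtol: float = 0.10) -> float:
    """Fraction of pixels whose predicted height lies within
    rtol * |target height| of the ground truth."""
    within = (pred - target).abs() <= rtol * target.abs()
    return within.float().mean().item()
```

(`torch.isclose(pred, target, rtol=0.1, atol=0.0)` would give essentially the same mask.)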

Small example:
Target heights: [[20.5435, 23.6946], [19.59, 22.646]]

Prediction heights: [[18.9545, 23.5007], [20.0153, 26.5432]]
This is just a made-up example; I do not know how close the values will actually get. Also, is there a way to write nice matrices in a question?
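For what it's worth, running the tolerance check from above on this made-up example flags 3 of the 4 pixels as within 10% (only 26.5432 vs. 22.646 falls outside):

```python
import torch

target = torch.tensor([[20.5435, 23.6946],
                       [19.59,   22.646 ]])
pred   = torch.tensor([[18.9545, 23.5007],
                       [20.0153, 26.5432]])

# element-wise check: |pred - target| <= 10% of |target|
within = (pred - target).abs() <= 0.10 * target.abs()
print(within.float().mean().item())  # 0.75 -> 3 of 4 pixels within tolerance
```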