Binary Segmentation. Representation of different classes

Augustas_Macys · March 23, 2021, 10:34pm

In binary segmentation, should the mask be represented as a matrix of 0s and 1s, where 0 is one class (background) and 1 is the other class? Or should it be represented as a matrix of 0s and 255s, where again 0 is one class (background) and 255 is the other class? Or does it not matter?

KFrank · March 24, 2021, 1:41pm

Hi Augustas!

The short answer is that the mask should be 0s and 1s (and, yes,
it does matter).

Specifically, when you use PyTorch’s BCEWithLogitsLoss (or its
numerically less stable cousin BCELoss), the target (mask) you
pass in should be a floating-point tensor of probabilities that range
from 0.0 to 1.0.
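
As a minimal sketch of what that looks like in practice (the tensor names, the tiny 2x2 mask, and the hand-written logits are made up for illustration, not taken from your code), an 8-bit 0/255 mask can be turned into a floating-point 0.0/1.0 target before it is passed to the loss:

```python
import torch
import torch.nn as nn

# Hypothetical 8-bit mask as it might come out of an image file:
# 0 = background, 255 = foreground. Shape: (batch, channel, height, width).
mask_uint8 = torch.tensor([[[[0, 255], [255, 0]]]], dtype=torch.uint8)

# Convert to a floating-point target containing only 0.0 and 1.0.
target = (mask_uint8 > 0).float()

# Stand-in for the raw (pre-sigmoid) output of a segmentation model.
logits = torch.tensor([[[[-2.0, 3.0], [1.5, -0.5]]]])

# BCEWithLogitsLoss applies the sigmoid internally, so it takes logits.
criterion = nn.BCEWithLogitsLoss()
loss = criterion(logits, target)
print(target.dtype, loss.item())
```

For a mask that really contains only 0 and 255, `mask_uint8.float() / 255.0` gives the same target.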

When your ground-truth target has complete certainty, it would
have a value of 0.0 for the background – 0% probability of being
a foreground pixel – and a value of 1.0 for the foreground – 100%
probability of being a foreground pixel. However, by not restricting
the target to have only the values 0.0 and 1.0, we admit the
possibility that the target can be probabilistic. So, for example,
a value of 0.75 would indicate that a pixel is probably a foreground
pixel – 75% likely – but could also be – 25% likely – a background
pixel.
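
To make the probabilistic reading concrete, here is a small sketch in plain Python. It uses the standard per-pixel binary cross-entropy formula, not PyTorch's own implementation, and the helper name `bce` and the grid of candidate predictions are just for illustration. For a target of 0.75, the loss is smallest when the predicted probability is also 0.75:

```python
import math

def bce(p, y):
    """Per-pixel binary cross-entropy for predicted probability p and target y."""
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))

y = 0.75  # soft target: 75% likely to be foreground

# Try predicted probabilities 0.01, 0.02, ..., 0.99 and keep the best one.
candidates = [i / 100 for i in range(1, 100)]
best_p = min(candidates, key=lambda p: bce(p, y))

print(best_p)          # the prediction that minimizes the loss
print(bce(best_p, y))  # the loss at that prediction (nonzero for a soft target)
```

Note that with a soft target the minimum loss is not zero, so a nonzero loss floor is expected in that case.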

(Using 0 and 255, as you might have in an 8-bit black-and-white
image, won’t work. PyTorch’s BCEWithLogitsLoss requires target
values in the range [0.0, 1.0] and will give meaningless results for
values outside of this range. Convert such a mask first, for example
by dividing by 255.)
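
To see what "meaningless" means here, the following sketch evaluates the usual binary-cross-entropy-with-logits formula by hand in plain Python. The function `bce_with_logits` is a hypothetical helper written for this illustration, not PyTorch's code. With a target of 255 the per-pixel loss goes negative, and it keeps dropping as the logit grows, so training would just push the logits toward infinity:

```python
import math

def bce_with_logits(x, y):
    """Stable form of -[y*log(sigmoid(x)) + (1 - y)*log(1 - sigmoid(x))]."""
    return max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))

loss_ok = bce_with_logits(3.0, 1.0)        # valid target: small positive loss
loss_bad = bce_with_logits(3.0, 255.0)     # target 255: large negative "loss"
loss_worse = bce_with_logits(10.0, 255.0)  # larger logit: even more negative

print(loss_ok, loss_bad, loss_worse)
```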

Best.

K. Frank