Multi-class semantic segmentation format

Hello All,

I am trying to adapt my binary semantic segmentation code to multi-class semantic segmentation.

I am having trouble figuring out how to convert my training masks to an appropriate format/shape.

My training labels are single-band (grayscale) images in which each unique integer value (e.g. 1, 2, 3, 4) represents a class label. How should these be transformed into the correct format?


For multi-class segmentation you would most likely use nn.CrossEntropyLoss, which expects the model output to have the shape [batch_size, nb_classes, height, width] and the target masks to have the shape [batch_size, height, width], containing the class indices in the range [0, nb_classes-1] as a LongTensor.
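A minimal sketch of the shapes involved might look like this. The sizes and the random tensors are placeholders I made up for illustration; in practice the logits would come from your model and the target from your dataset.

```python
import torch
import torch.nn as nn

# Placeholder sizes, chosen only for illustration.
batch_size, nb_classes, height, width = 2, 4, 8, 8

# Stand-in for the raw model output (logits): one channel per class.
logits = torch.randn(batch_size, nb_classes, height, width)

# Target: class indices in [0, nb_classes - 1], no channel dimension, dtype torch.long.
target = torch.randint(0, nb_classes, (batch_size, height, width))

criterion = nn.CrossEntropyLoss()
loss = criterion(logits, target)  # scalar tensor (default reduction is 'mean')
```

Note that the target has no channel dimension and is not one-hot encoded.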
If your current targets are grayscale images with discrete gray values for each class, you could map these values to class indices e.g. by using a lookup table.
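Here is a rough sketch of such a lookup table, assuming 8-bit masks and the gray values 1, 2, 3, 4 from your example. The names gray_values and lut are just illustrative, and the tiny mask tensor stands in for a mask loaded from disk.

```python
import torch

# Assumed gray values found in the masks, listed in the desired class order:
# gray value 1 -> class 0, 2 -> class 1, 3 -> class 2, 4 -> class 3.
gray_values = [1, 2, 3, 4]

# Lookup table indexed by gray value (0..255); -1 marks values without a class.
lut = torch.full((256,), -1, dtype=torch.long)
lut[torch.tensor(gray_values)] = torch.arange(len(gray_values))

# Stand-in for a loaded grayscale mask of shape [height, width].
mask = torch.tensor([[1, 2], [4, 3]], dtype=torch.uint8)

# Convert to long before indexing, since a uint8 tensor would be treated as a boolean mask.
target = lut[mask.long()]

# Fail early if the mask contains a gray value that is not in the table.
assert (target >= 0).all(), "mask contains an unmapped gray value"
```

The same indexing works on a batched mask of shape [batch_size, height, width]. If your values are already consecutive integers starting at 1, subtracting 1 from the mask would give the same result.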