How to convert a 55-color PNG segmentation mask into a [55, H, W] tensor?

Olivier-CR · October 12, 2021, 4:54pm

Hi,
I’m trying to train a DeepLabV3 model on a segmentation dataset where annotations come as PNG pictures with 55 different colors.

When I read such an annotation PNG with torchvision's read_image, I get the following tensor (shape [3, H, W]):

tensor([[[135, 135, 135,  ..., 135, 135, 135],
         [135, 135, 135,  ..., 135, 135, 135],
         [135, 135, 135,  ..., 135, 135, 135],
         ...,
         [255, 255, 255,  ..., 200, 200, 200],
         [255, 255, 255,  ..., 200, 200, 200],
         [255, 255, 255,  ..., 200, 200, 200]],

        [[206, 206, 206,  ..., 206, 206, 206],
         [206, 206, 206,  ..., 206, 206, 206],
         [206, 206, 206,  ..., 206, 206, 206],
         ...,
         [  0,   0,   0,  ..., 125, 125, 125],
         [  0,   0,   0,  ..., 125, 125, 125],
         [  0,   0,   0,  ..., 125, 125, 125]],

I think my DeepLab model will need this as a one-hot tensor of shape [55, H, W].
I cannot find any code that does this transformation from [3, H, W] to [55, H, W]. Does anybody have a snippet of code to share? How do segmentation datasets usually handle this?

ptrblck · October 13, 2021, 7:32am

This post gives you an example of a color to class index mapping for the targets of a segmentation use case.
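
For reference, here is a minimal sketch of such a color to class index mapping. It is not the code from the linked post. It assumes you have a fixed palette tensor of shape [C, 3] where row i holds the RGB color of class i; the helper name `rgb_mask_to_class_index` and the three-color palette below are hypothetical stand-ins for your 55 colors. Note that nn.CrossEntropyLoss usually takes class-index targets of shape [N, H, W], so the one-hot step at the end is often not needed.

```python
import torch
import torch.nn.functional as F


def rgb_mask_to_class_index(mask, palette):
    """Map a [3, H, W] uint8 color mask to a [H, W] int64 class-index mask.

    palette: [C, 3] tensor of RGB colors; row i is the color of class i
    (hypothetical layout, adapt it to your dataset).
    """
    # Pack each RGB triple into one integer so colors compare as scalars.
    weights = torch.tensor([256 * 256, 256, 1], dtype=torch.long)
    mask_codes = (mask.long() * weights.view(3, 1, 1)).sum(dim=0)  # [H, W]
    palette_codes = (palette.long() * weights).sum(dim=1)          # [C]

    # Compare every pixel against every palette entry: [C, H, W] booleans.
    matches = mask_codes.unsqueeze(0) == palette_codes.view(-1, 1, 1)
    if not matches.any(dim=0).all():
        raise ValueError("mask contains a color that is not in the palette")

    # With unique palette colors there is one match per pixel,
    # so argmax over the class dimension returns its index.
    return matches.long().argmax(dim=0)


# Tiny example with a made-up 3-color palette (yours would have 55 rows).
palette = torch.tensor(
    [[135, 206, 235], [255, 0, 0], [200, 125, 0]], dtype=torch.uint8
)
class_ids = torch.tensor([[0, 0, 2], [1, 1, 2]])
mask = palette[class_ids].permute(2, 0, 1)             # [3, H, W] color mask

target = rgb_mask_to_class_index(mask, palette)        # [H, W], int64
# Optional: one-hot [C, H, W], only if your loss really needs it.
one_hot = F.one_hot(target, num_classes=palette.shape[0]).permute(2, 0, 1)
```

The palette order has to stay the same for every image in the dataset, otherwise class indices will not be consistent between samples.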

Olivier-CR · October 13, 2021, 8:01am

wow absolutely epic thanks @ptrblck