Normalizing masks in image segmentation

Hi guys

I have seen many image segmentation applications in which people apply normalization to the images and the “masks” at the same time. It seems to me that normalizing the mask simply messes up the class labels encoded there, so the model ends up being trained on labels that have no meaning. Am I missing something? Thank you for your help!
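To make the concern concrete, here is a minimal sketch (with a toy 2x2 mask and assumed class ids 0..2) showing how image-style mean/std normalization turns integer class indices into fractional values that no longer correspond to any class:

```python
import torch

# toy segmentation mask containing class indices (assumed classes 0..2)
mask = torch.tensor([[0, 1], [2, 1]])

# applying image-style normalization (subtract mean, divide by std) to the mask
m = mask.float()
normalized = (m - m.mean()) / m.std()
print(normalized)  # fractional values -- the class ids 0/1/2 are gone
```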

Your concern is correct assuming the mask contains class indices. In that case, transformations such as resizing can still be applied to the mask tensor, but the nearest mode should be selected so that the class indices are kept rather than interpolated. If the wrong interpolation mode is used, float values will often be created in the mask, which would then fail during the loss calculation, as e.g. nn.CrossEntropyLoss expects LongTensors as the target. In any case, I would expect official examples as well as popular repositories to avoid these mistakes.
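A small sketch of the safe workflow described above (shapes and class count are assumed for illustration): resize the mask with `mode="nearest"` so the integer ids survive, cast back to `long`, and pass it as the target to `nn.CrossEntropyLoss`:

```python
import torch
import torch.nn.functional as F

# mask of class indices 0..2 with batch and channel dims for F.interpolate
mask = torch.randint(0, 3, (1, 1, 8, 8))

# nearest-neighbor resizing copies values instead of interpolating,
# so the original integer class ids are preserved
resized = F.interpolate(mask.float(), size=(16, 16), mode="nearest")
target = resized.long().squeeze(1)  # CrossEntropyLoss expects a LongTensor target

logits = torch.randn(1, 3, 16, 16)  # model output: [batch, classes, H, W]
loss = torch.nn.CrossEntropyLoss()(logits, target)
```

Using `mode="bilinear"` here would instead blend neighboring ids (e.g. producing 1.5 between classes 1 and 2), which is exactly the failure mode discussed above.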

Thank you so much for your insight @ptrblck . Resizing is alright, but I saw lots of notebooks, posts, etc. applying the very same transformations to the masks as well, such as adding noise or blurring the class indices. Your clarification was important!

These transformations applied to the mask (again, assuming the mask contains class indices) sound quite dangerous, but I would also expect them to surface as dtype mismatches.
If you are seeing these transformations applied to a mask with an additional conversion to LongTensors via mask = mask.long(), I’m sure the authors of the repository would be happy to hear about your valid concern. :slight_smile:
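To illustrate why the `mask = mask.long()` cast is the dangerous part: it silences the dtype error while leaving corrupted labels behind. A minimal sketch (toy mask and a simple box blur standing in for a blur augmentation; class ids are assumed):

```python
import torch
import torch.nn.functional as F

# toy mask: left half is class 0, right half is class 5 (assumed labels)
mask = torch.zeros(1, 1, 4, 4)
mask[..., 2:] = 5.0

# a simple 3x3 box blur stands in for a blur/noise augmentation
kernel = torch.ones(1, 1, 3, 3) / 9.0
blurred = F.conv2d(mask, kernel, padding=1)

# casting back to long removes the dtype mismatch, but the boundary
# pixels now hold averaged "classes" that were never valid labels
corrupted = blurred.long()
print(corrupted.unique())  # intermediate ids appear; class 5 may vanish entirely
```

The training then proceeds without any error, which is why these bugs can go unnoticed in published notebooks.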