How does CE loss get computed concretely on multi-class segmentation?

Hello, I found someone using it like this:
logits has shape [2, 22, 256, 256]
mask has shape [2, 256, 256]

The category is encoded in the mask. For example, if a pixel belongs to a person, it has the value 17 (the human's code in the VOC dataset).

Does CELoss first one-hot encode the mask into 22 channels, and then pair each generated mask channel with the corresponding logit channel to compute the cross entropy?


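CELoss can be implemented without creating the one-hot encoding. To reduce memory use, our CE function takes the label as a LongTensor that contains the index of the correct class for each pixel. The loss at a pixel is then the negative log-softmax of the logits evaluated at that index, which is the same value the one-hot formulation would produce.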
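To make this concrete, here is a minimal sketch that compares three ways of getting the same number: the built-in `F.cross_entropy`, picking the target log-probability by index, and the explicit one-hot version from the question. The tensor names, the random data, and the small 8x8 spatial size are assumptions for illustration (the post uses 256x256), and the index-based variant only mirrors the idea; it is not a copy of the library's internal implementation.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical sizes: 2 images, 22 classes, 8x8 pixels (the post uses 256x256).
N, C, H, W = 2, 22, 8, 8
logits = torch.randn(N, C, H, W)        # raw scores, one channel per class
mask = torch.randint(0, C, (N, H, W))   # int64 (LongTensor) class index per pixel

# 1) Built-in loss: takes class indices directly, default reduction is the mean
#    over all N*H*W pixels.
loss_builtin = F.cross_entropy(logits, mask)

# 2) Index-based: log-softmax over the class dimension, then pick the
#    log-probability of the target class at every pixel.
log_probs = F.log_softmax(logits, dim=1)                     # [N, C, H, W]
picked = log_probs.gather(1, mask.unsqueeze(1)).squeeze(1)   # [N, H, W]
loss_gather = -picked.mean()

# 3) Explicit one-hot: build a [N, C, H, W] target and multiply channel by channel.
one_hot = F.one_hot(mask, num_classes=C).permute(0, 3, 1, 2).float()
loss_one_hot = -(one_hot * log_probs).sum(dim=1).mean()

print(loss_builtin.item(), loss_gather.item(), loss_one_hot.item())
```

All three values should agree up to floating-point error. The one-hot version has to allocate a float target as large as the logits, while the index-based versions only need the [N, H, W] mask, which is where the memory saving comes from.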