Ignore_index() in nn.CrossEntropyLoss() for semantic segmentation

I am trying to train a fully convolutional net from scratch for a semantic segmentation task, but the training set I have is sparse, meaning that I have to ignore pixels that do not contain information (label=0) while training.
Otherwise, I have 5 classes I am interested in retrieving.
To achieve that, I just added the argument ignore_index to the cross entropy loss function to drop the 0-labeled pixels.
However, I am confused because I would expect the net to output 5 scores (for the 5 classes of interest), but it also gives me a score for class 0.

Does anyone have any idea why I get 6 class scores when I expect only 5?

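The ignore_index argument only masks the loss for pixels whose target equals that value. It does not remove the class from the model's output, so a net with 6 output channels still produces 6 scores.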
From the docs:

ignore_index (int, optional) – Specifies a target value that is ignored and does not contribute to the input gradient. When size_average is True, the loss is averaged over non-ignored targets.
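
To make the quoted behavior concrete, here is a minimal sketch, assuming a 6-channel output (class 0 plus the 5 classes of interest) and made-up tensor shapes. It checks that the loss with ignore_index=0 matches the mean over the non-ignored pixels and that the ignored pixels receive zero gradient.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Assumed shapes for illustration: batch of 2, 6 channels, 4x4 label maps.
logits = torch.randn(2, 6, 4, 4, requires_grad=True)
target = torch.randint(0, 6, (2, 4, 4))
target[0, 0, 0] = 0  # guarantee at least one ignored pixel
target[0, 0, 1] = 3  # guarantee at least one labeled pixel

criterion = nn.CrossEntropyLoss(ignore_index=0)
loss = criterion(logits, target)
loss.backward()

# Same value by hand: per-pixel loss, averaged over pixels with label != 0.
per_pixel = F.cross_entropy(logits.detach(), target, reduction="none")
valid = target != 0
manual = per_pixel[valid].mean()

# Gradient rows (one per ignored pixel, 6 channels each) should be all zeros.
ignored_grad = logits.grad.permute(0, 2, 3, 1)[~valid]
```

Note that the model still has 6 output channels here; only the loss changes.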

If you would like to ignore it in the prediction, you could just slice the output:

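import torch
output = torch.randn(1, 6)  # dummy scores: class 0 plus the 5 classes of interest
_, preds = torch.max(output[0, 1:], 0)  # index into the sliced scores; add 1 to get the original label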
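
For a segmentation net the output is usually a map rather than a single score vector. A rough sketch of the same slicing on a batch, assuming logits of shape (N, 6, H, W) with the ignored class in channel 0 (the shapes are made up):

```python
import torch

torch.manual_seed(0)

# Assumed output of the net: batch of 2, 6 channels, 4x4 pixels.
logits = torch.randn(2, 6, 4, 4)

# Drop channel 0, take the best remaining channel per pixel,
# then shift by 1 so the result uses the original labels 1..5.
preds = logits[:, 1:].argmax(dim=1) + 1
```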

Thank you,
I hesitated to slice the output score vector since I am not sure what to do if, at prediction time, the ignored class gets a very high score and the other (non-ignored) ones are nearly equal…
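
One way to look at this case, as a sketch with made-up scores: taking the softmax over the sliced scores gives the same result as taking the full softmax and renormalizing over the 5 remaining classes, so the size of the class-0 score does not change which of the other classes wins. Whether a near-tie among them should be trusted is a separate question that this does not answer.

```python
import torch

# Made-up scores where the ignored class (channel 0) dominates.
logits = torch.tensor([[8.0, 0.1, 0.2, 0.1, 0.3, 0.1]])

# Full softmax, then renormalize over classes 1..5.
full = torch.softmax(logits, dim=1)
renorm = full[:, 1:] / full[:, 1:].sum(dim=1, keepdim=True)

# Softmax taken directly over the sliced scores.
sliced = torch.softmax(logits[:, 1:], dim=1)

# Predicted label in the original numbering (1..5).
pred = sliced.argmax(dim=1) + 1
```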

Thanks man. Helps a lot.


Hey, I was facing a similar issue, but I am not sure why it works here. The Cityscapes dataset has 19 classes in total plus one ignore label. Since the ignore label makes 20 classes, I would expect the model to need 20 output classes, but that is not the case: they use only 19. When I try to do a similar thing (although with a different pipeline), I get an assertion error about the class. Please check this out, thanks in advance.
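
For reference, a minimal sketch of how a 19-channel model can coexist with an ignore label. It assumes the ignore label is stored as 255 in the target (an assumption here, not something confirmed in this thread) and uses made-up shapes. The value passed as ignore_index does not have to be a valid class index, so no extra output channel is needed. The check for out-of-range labels may help track down a class-related error.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

NUM_CLASSES = 19    # assumed number of trainable classes
IGNORE_LABEL = 255  # assumed value used for ignored pixels

# Made-up model output and target: 19 channels, 8x8 pixels.
logits = torch.randn(1, NUM_CLASSES, 8, 8)
target = torch.randint(0, NUM_CLASSES, (1, 8, 8))
target[0, :2, :] = IGNORE_LABEL  # mark the top two rows as ignored

# Every label must be either IGNORE_LABEL or within [0, NUM_CLASSES - 1].
out_of_range = (target != IGNORE_LABEL) & ((target < 0) | (target >= NUM_CLASSES))

criterion = nn.CrossEntropyLoss(ignore_index=IGNORE_LABEL)
loss = criterion(logits, target)
```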

Is this a double post from here or another issue we should have a look at? :)