CrossEntropyLoss now has an
ignore_index parameter, but I am not sure I understand what that means:
ignore_index (int, optional): Specifies a target value that is ignored
and does not contribute to the input gradient. When size_average is
True, the loss is averaged over non-ignored targets.
First of all, I know that
CrossEntropyLoss takes a 1-dimensional array of targets:
Target: :math:`(N)` where each value is `0 <= targets[i] <= C-1`
So then I assume that
ignore_index allows you to ignore one of the outputs in the loss calculation. I can imagine it’s useful to mask a whole bunch of outputs, but what is the use case for ignoring only a single output node?
I have probably misunderstood what
ignore_index does. When do people use it?
This is used to mask a specific label.
For example, in semantic segmentation, we might have a
-1 label that stands for “don’t care”, meaning that whatever you predict in that region is not taken into account in the evaluation (because it can be ambiguous).
In this case, you would set ignore_index to
-1, so that those indices are not taken into account.
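A minimal sketch of the masking described above, assuming a toy batch of 4 samples and 3 classes (the sizes and target values are made up for illustration). Targets equal to -1 are skipped, and with the default mean reduction the loss is averaged over the remaining targets only:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

logits = torch.randn(4, 3, requires_grad=True)  # (N, C) toy scores
target = torch.tensor([0, -1, 2, -1])           # -1 marks "don't care"

criterion = nn.CrossEntropyLoss(ignore_index=-1)
loss = criterion(logits, target)
loss.backward()

# Reference value: the mean loss over the non-ignored samples only.
keep = target != -1
reference = F.cross_entropy(logits[keep], target[keep])

# Rows whose target is ignored receive no gradient.
ignored_grad = logits.grad[~keep]
print(loss.item(), reference.item(), ignored_grad)
```

The two loss values should agree, and the gradient rows for the ignored samples should be all zeros, which is what "does not contribute to the input gradient" refers to.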
Thanks for the clarification!
I would like to know more. I am using the torchvision segmentation model from your repo. There, the
ignore_index is set to
If using a dataset that has multiple classes to ignore during evaluation, say Cityscapes, I am manually using the training IDs and setting all those classes to
0, which is the same as the background class. How can I leverage the ignore_index parameter instead?
I am facing an issue when using ignore_index on the Cityscapes dataset for semantic segmentation. There are 19 classes plus one extra ignore class. I used 255 for this ignore class and passed the same value as ignore_index to the cross-entropy loss. When I run the model with 19 output classes, I get an assertion error from the cross-entropy loss:
/pytorch/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [3,0,0], thread: [743,0,0] Assertion `t >= 0 && t < n_classes` failed.
This error is somewhat expected, because I am passing the label 255, but since that value is also passed as ignore_index, I assumed the error should not occur. If this is not how ignore_index works, can you tell me how I can ignore the label 255 and still have 19 classes in the model?
I cannot reproduce this issue with this small code snippet:
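import torch
import torch.nn as nn

x = torch.randn(10, 19, requires_grad=True, device='cuda')  # logits for 19 classes
y = torch.randint(0, 19, (10,), device='cuda')  # valid targets in [0, 18]
y[0] = 255  # overwrite one target with the ignored label
criterion = nn.CrossEntropyLoss(ignore_index=255)
loss = criterion(x, y)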
Are you sure that the ignored index 255 is causing the issue and not another unexpected target index?
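One way to answer that question is to look for target values that are neither valid class indices nor the ignored index. The sketch below assumes 19 classes and ignore_index=255 as in the post above; the helper name unexpected_labels and the toy target are hypothetical, not part of PyTorch:

```python
import torch

def unexpected_labels(target, num_classes, ignore_index):
    """Return the unique target values the loss would reject (hypothetical helper)."""
    valid = (target >= 0) & (target < num_classes)
    ignored = target == ignore_index
    return torch.unique(target[~(valid | ignored)])

# Toy target: 128 is a stray label, 255 is the ignored one.
target = torch.tensor([[0, 18, 255], [7, 128, 255]])
bad = unexpected_labels(target, num_classes=19, ignore_index=255)
print(bad.tolist())  # [128]
```

Running a check like this on the CPU before computing the loss gives a readable message instead of a device-side assertion.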
I have debugged it and you are right: some other label was causing the issue. The other labels appeared because I was using bilinear interpolation; when I changed it to nearest, they no longer appeared. Although this question is not related to PyTorch, I was checking whether a given label was present in the numpy array as:
This statement did not work for the numpy array but worked for a torch tensor, which is a little strange, I guess. But anyway, thanks for the solution.
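A small sketch of why the interpolation mode matters, assuming the label maps were being resized with torch.nn.functional.interpolate (the post does not say which function was used, and the 2x2 toy map is made up). Bilinear resizing blends neighbouring label values into numbers that are not labels at all, while nearest only copies existing ones:

```python
import torch
import torch.nn.functional as F

# Toy 2x2 label map: class 0 next to the ignore label 255.
labels = torch.tensor([[0, 255], [0, 255]])
x = labels[None, None].float()  # interpolate expects (N, C, H, W) floats

nearest = F.interpolate(x, size=(4, 4), mode="nearest").long()
bilinear = F.interpolate(
    x, size=(4, 4), mode="bilinear", align_corners=False
).long()

print(torch.unique(nearest).tolist())   # only the original labels
print(torch.unique(bilinear).tolist())  # also contains blended values
```

Any blended value that is at least the number of classes and not equal to ignore_index will trigger the assertion shown earlier in the thread.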
Good to hear it’s working now!
That’s indeed strange, as
numpy should also be able to perform this check:
import numpy as np
x = np.random.randint(0, 20, (100,))
x[x > 10] = 255
print((x == 255).any())  # check whether the label 255 is present
Yeah, that’s strange; it took me a while to debug the problem precisely because of that numpy statement. Anyway, thanks for your time.
What if I want several don’t-care classes? For example, in time-series classification, where I have to feed in all the input, but not everything should contribute a gradient, and there are several such classes.
In that case you could change the targets to use the same “ignore index” and could then pass this index to the criterion so that it’ll be ignored.
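A minimal sketch of that remapping, assuming 255 as the shared ignore value and class IDs 3 and 4 as the don’t-care classes (both are made-up choices for illustration):

```python
import torch
import torch.nn as nn

IGNORE_INDEX = 255                 # assumed sentinel value
dont_care = torch.tensor([3, 4])   # hypothetical class IDs to skip

target = torch.tensor([0, 3, 1, 4, 2, 3])
remapped = target.clone()
# Collapse every don't-care class onto the single ignore value.
remapped[torch.isin(target, dont_care)] = IGNORE_INDEX

logits = torch.randn(6, 5)         # (N, C) scores from the model
criterion = nn.CrossEntropyLoss(ignore_index=IGNORE_INDEX)
loss = criterion(logits, remapped)
print(remapped.tolist(), loss.item())
```

The original target tensor is left untouched, so the don’t-care labels are still available for evaluation or debugging. torch.isin requires a reasonably recent PyTorch release; on older versions a loop over the don’t-care IDs with `remapped[target == c] = IGNORE_INDEX` does the same thing.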