Issue with nn.CrossEntropyLoss() Segmentation

Hi, I have a prediction of shape [1, 1, 136, 100], where the number of classes is 1 and the batch size is 1.
My Label is of size [1, 136, 100]:
Label = [[[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,255,…]]]
I am receiving a CUDA assert error.
The further trace is:
self.criterion = nn.CrossEntropyLoss(ignore_index=255)
loss_semantic_seg = self.criterion(mask_pred, labels)
File "/home/danish/miniconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/home/danish/miniconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/loss.py", line 916, in forward
ignore_index=self.ignore_index, reduction=self.reduction)
File "/home/danish/miniconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/functional.py", line 2021, in cross_entropy
return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
File "/home/danish/miniconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/functional.py", line 1840, in nll_loss
ret = torch._C._nn.nll_loss2d(input, target, weight, _Reduction.get_enum(reduction), ignore_index)

Hi Danish!

Based on this, I speculate that you are performing binary segmentation
of an image. That is, for each pixel in your sample image, you want to
predict whether it is “background” or “foreground.”

If so, you should be using BCEWithLogitsLoss.

Let’s say that your sample images have shape [height, width]. Then
an input batch will have shape [nBatch, height, width], and your
prediction will have the same shape, with no class dimension. So,
using your sizes, both your input batch and prediction will have shape
[1, 136, 100].
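
For example, if your network currently gives you the [1, 1, 136, 100]
prediction from your post, one way (just a sketch) to drop the singleton
class dimension is squeeze():

>>> import torch
>>> mask_pred = torch.randn (1, 1, 136, 100)   # shape from your post
>>> mask_pred.squeeze (1).shape                 # drop the singleton class dimension
torch.Size([1, 136, 100])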

Your target (your “Label”) should also have this same shape, and
contain values in the range [0.0, 1.0], inclusive. For a purely binary
target (e.g., black-and-white), these values can be exactly 0.0 and
1.0. Note that for a black-and-white segmentation mask you do not
want values of 0 and 255, but rather 0.0 and 1.0.

Lastly, your prediction should be the raw-score logits output by your
final Linear layer, and not be passed through a sigmoid() that would
convert them to probabilities.
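
Putting this together, here is a minimal sketch of what the
BCEWithLogitsLoss call would look like (the shapes are taken from your
post; mask_pred and labels are just made-up stand-ins for your actual
prediction and target):

>>> import torch
>>> criterion = torch.nn.BCEWithLogitsLoss()
>>> mask_pred = torch.randn (1, 136, 100)              # raw logits, no sigmoid()
>>> labels = (torch.rand (1, 136, 100) > 0.5).float()  # 0.0 / 1.0 target mask
>>> loss = criterion (mask_pred, labels)
>>> loss.shape
torch.Size([])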

Best.

K. Frank

Thanks Frank, I think I have found the issue: it is because of the label range, which is [0, 255]. Can you tell me how I can convert it to [0, 1] based on some threshold, e.g. 127?

Hi Danish!

Thresholding will certainly work. You could also use torch.where().
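
For example (a quick sketch with a made-up one-dimensional target,
using the threshold of 127 you suggested):

>>> import torch
>>> targ = torch.tensor ([0, 10, 200, 255, 127])
>>> (targ > 127).float()                                 # simple thresholding
tensor([0., 0., 1., 1., 0.])
>>> torch.where (targ > 127, torch.tensor (1.0), torch.tensor (0.0))
tensor([0., 0., 1., 1., 0.])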

If you know that your values will always be exactly 0 or 255, you
could divide by 255 (and convert to float):

>>> import torch
>>> torch.__version__
'1.7.1'
>>> targ = torch.tensor ([0, 255, 0, 0, 255])
>>> targ.dtype
torch.int64
>>> targ = (targ / 255).float()
>>> targ
tensor([0., 1., 0., 0., 1.])
>>> targ.dtype
torch.float32

Best.

K. Frank