How to set ignore_index for a BCE task with nn.CrossEntropyLoss

James_Lee · October 10, 2021, 6:52am

Hi there. I found that torch.nn.BCELoss doesn’t offer an ignore_index parameter the way torch.nn.CrossEntropyLoss does. I tried implementing the BCE loss by calling nn.CrossEntropyLoss with ignore_index=-1, but it failed. Does anyone have any ideas on this? Thanks.

sarat · October 10, 2021, 7:25am

What’s the error message?

Do you have the labels that you want to ignore labelled as -1?

James_Lee · October 10, 2021, 7:30am

My code,

criterion_CE = nn.CrossEntropyLoss(ignore_index=-1).cuda()

…
labels[(0.3 <= labels) & (labels <= 0.7)] = -1
…

loss = criterion_CE(out, torch.squeeze(labels).long())

Error,

/opt/conda/conda-bld/pytorch_1595629395347/work/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [5,0,0], thread: [761,0,0] Assertion t >= 0 && t < n_classes failed.
/opt/conda/conda-bld/pytorch_1595629395347/work/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [5,0,0], thread: [500,0,0] Assertion t >= 0 && t < n_classes failed.
/opt/conda/conda-bld/pytorch_1595629395347/work/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [5,0,0], thread: [501,0,0] Assertion t >= 0 && t < n_classes failed.
/opt/conda/conda-bld/pytorch_1595629395347/work/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [5,0,0], thread: [502,0,0] Assertion t >= 0 && t < n_classes failed.
/opt/conda/conda-bld/pytorch_1595629395347/work/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [5,0,0], thread: [503,0,0] Assertion t >= 0 && t < n_classes failed.
/opt/conda/conda-bld/pytorch_1595629395347/work/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [5,0,0], thread: [504,0,0] Assertion t >= 0 && t < n_classes failed.
/opt/conda/conda-bld/pytorch_1595629395347/work/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [5,0,0], thread: [505,0,0] Assertion t >= 0 && t < n_classes failed.
/opt/conda/conda-bld/pytorch_1595629395347/work/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [5,0,0], thread: [884,0,0] Assertion t >= 0 && t < n_classes failed.
/opt/conda/conda-bld/pytorch_1595629395347/work/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [5,0,0], thread: [885,0,0] Assertion t >= 0 && t < n_classes failed.
/opt/conda/conda-bld/pytorch_1595629395347/work/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [5,0,0], thread: [886,0,0] Assertion t >= 0 && t < n_classes failed.
/opt/conda/conda-bld/pytorch_1595629395347/work/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [5,0,0], thread: [887,0,0] Assertion t >= 0 && t < n_classes failed.
/opt/conda/conda-bld/pytorch_1595629395347/work/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [5,0,0], thread: [888,0,0] Assertion t >= 0 && t < n_classes failed.
/opt/conda/conda-bld/pytorch_1595629395347/work/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [5,0,0], thread: [889,0,0] Assertion t >= 0 && t < n_classes failed.
/opt/conda/conda-bld/pytorch_1595629395347/work/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [5,0,0], thread: [890,0,0] Assertion t >= 0 && t < n_classes failed.
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1595629395347/work/aten/src/THCUNN/generic/SpatialClassNLLCriterion.cu line=134 error=710 : device-side assert triggered
Traceback (most recent call last):
  File "train.py", line 239, in <module>
    main()
  File "train.py", line 121, in main
    train(net, optimizer)
  File "train.py", line 199, in train
    loss5 = criterion_CE(out, torch.squeeze(labels).long())
  File "/home/public/software/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/public/software/anaconda3/lib/python3.8/site-packages/torch/nn/modules/loss.py", line 947, in forward
    return F.cross_entropy(input, target, weight=self.weight,
  File "/home/public/software/anaconda3/lib/python3.8/site-packages/torch/nn/functional.py", line 2422, in cross_entropy
    return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
  File "/home/public/software/anaconda3/lib/python3.8/site-packages/torch/nn/functional.py", line 2220, in nll_loss
    ret = torch._C._nn.nll_loss2d(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
RuntimeError: cuda runtime error (710) : device-side assert triggered at /opt/conda/conda-bld/pytorch_1595629395347/work/aten/src/THCUNN/generic/SpatialClassNLLCriterion.cu:134

James_Lee · October 10, 2021, 7:45am

And my labels had been normalized to [0, 1] before training. Here I would like to “ignore” those between 0.3 and 0.7 when calculating the BCE loss.

sarat · October 10, 2021, 7:53am

If you are creating labels for binary classification by some process, ensure that the labels are 0 and 1. Any ranges to be ignored can be set to -1.

Check whether your targets contain exactly 3 unique values: 0, 1, and -1.
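
A minimal sketch of that check, assuming soft labels in [0, 1] like the ones described above. The tensor values and the variable names are hypothetical. Note that .long() truncates toward zero, so a label such as 0.9 would become 0 unless it is binarized first. The assertion t >= 0 && t < n_classes in the log also means that every target other than ignore_index has to be a valid class index for the number of channels in the model output.

```python
import torch

# Hypothetical soft labels in [0, 1], standing in for the real `labels` tensor.
labels = torch.tensor([[0.0, 0.2, 0.5], [0.7, 0.9, 1.0]])

# Start with everything ignored, then binarize the confident values explicitly
# instead of relying on .long(), which would truncate 0.9 down to 0.
targets = torch.full_like(labels, -1.0)
targets[labels < 0.3] = 0.0
targets[labels > 0.7] = 1.0
targets = targets.long()

# The targets should now hold only the values -1, 0 and 1.
unique_values = torch.unique(targets).tolist()
print(unique_values)
```

With targets of 0 and 1, nn.CrossEntropyLoss expects the model output to have two channels along dimension 1; a single-channel output would accept only class index 0.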

James_Lee · October 10, 2021, 8:12am

        criterion_CE = nn.CrossEntropyLoss(ignore_index=-1).cuda()
        labels1 = functional.interpolate(labels, size=24, mode='bilinear')
        loss1 = criterion_CE(attention1, torch.squeeze(labels1).long())

Here’s the way I calculate my final BCE loss. It seems that functional.interpolate turned the -1s into 0s. Does functional.interpolate matter here? How could I avoid this?
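
A likely explanation, sketched under assumptions: bilinear interpolation blends each -1 with its neighbours, producing fractions such as -0.5, and .long() then truncates those toward zero. Resizing with mode='nearest' only copies existing values, so the -1 markers survive. The tiny label map below is hypothetical and is upsampled only to keep the example small.

```python
import torch
import torch.nn.functional as F

# Hypothetical 1x1x2x2 label map: one ignored pixel (-1) next to valid ones.
labels = torch.tensor([[[[-1.0, 1.0], [0.0, 1.0]]]])

# Bilinear resizing averages neighbouring pixels, so -1 is blended into fractions.
bilinear = F.interpolate(labels, size=4, mode="bilinear", align_corners=False)

# Nearest-neighbour resizing only copies values that already exist.
nearest = F.interpolate(labels, size=4, mode="nearest")

# Count how many pixels are still marked as ignored after the cast to long.
ignored_after_bilinear = int((bilinear.long() == -1).sum())
ignored_after_nearest = int((nearest.long() == -1).sum())
print(ignored_after_bilinear, ignored_after_nearest)
```

Nearest-neighbour resizing is generally the safer choice for label maps, since averaged class indices have no meaning.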

SaisaiSun · June 24, 2024, 3:28am

Hi James, did you find a solution for adding ignore_index to torch.nn.BCELoss?

SaisaiSun · June 24, 2024, 3:29am

Here is one: “Ignore padding area in loss computation”.
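
For reference, one common way to get the effect of ignore_index with a BCE loss is to compute the loss per element and mask out the ignored positions before averaging. The sketch below assumes raw logits as model output and -1 as the ignore marker; masked_bce_with_logits is a hypothetical helper, not part of torch.nn, and is not necessarily what the linked thread proposes.

```python
import torch
import torch.nn.functional as F


def masked_bce_with_logits(logits, targets, ignore_value=-1):
    """Mean BCE over the elements whose target is not `ignore_value`.

    Hypothetical helper for illustration; not a PyTorch API.
    """
    valid = targets != ignore_value
    # Swap ignored targets for 0 so the per-element loss is well defined;
    # those positions are zeroed out by the mask below anyway.
    safe_targets = torch.where(valid, targets, torch.zeros_like(targets))
    per_element = F.binary_cross_entropy_with_logits(
        logits, safe_targets, reduction="none"
    )
    per_element = per_element * valid
    # clamp avoids a division by zero when every element is ignored.
    return per_element.sum() / valid.sum().clamp(min=1)


logits = torch.tensor([2.0, -1.0, 0.5, 3.0])
targets = torch.tensor([1.0, 0.0, -1.0, -1.0])  # last two are ignored
loss = masked_bce_with_logits(logits, targets)
print(loss)
```

Because the mask is applied after the per-element loss is computed, the ignored positions contribute neither to the loss value nor to the gradient.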