Got RuntimeError: Boolean value of Tensor with more than one value is ambiguous during training

Hello. I’m trying to set ignore_index for labels during training with CE loss. My code looks as follows:

    for i, data in enumerate(train_loader):
        labels[labels > 0.9] = 1
        labels[labels < 0.1] = 0

        # bug occurred here
        labels[0.1<=labels<=0.9] = -1

        loss = nn.CrossEntropyLoss(ignore_index=-1)(out, labels.long())

I got RuntimeError: Boolean value of Tensor with more than one value is ambiguous
during training.

What’s wrong here? Thanks.

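Python expands the chained comparison 0.1<=labels<=0.9 into (0.1<=labels) and (labels<=0.9). The "and" needs a single True/False value from the first tensor, which is ambiguous for a tensor with more than one element. You have to combine the two conditions element-wise via & and wrap each one in parentheses:

    labels[(0.1<=labels) & (labels<=0.9)] = -1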
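To make the difference concrete, here is a minimal sketch that first reproduces the error and then applies the element-wise mask. The tensor values are made up for illustration and are not taken from the original training data.

```python
import torch

# Hypothetical soft labels in [0, 1]; the values are invented for this example.
labels = torch.tensor([0.95, 0.05, 0.50, 0.30, 0.70, 0.99])

# The chained comparison calls bool() on a multi-element tensor and raises.
try:
    labels[0.1 <= labels <= 0.9] = -1
    chained_failed = False
except RuntimeError as err:
    chained_failed = True
    print("chained comparison failed:", err)

# Element-wise version: each comparison yields a boolean mask,
# and & combines the two masks position by position.
labels[labels > 0.9] = 1
labels[labels < 0.1] = 0
labels[(0.1 <= labels) & (labels <= 0.9)] = -1

print(labels.tolist())  # values are now 1, 0, -1, -1, -1, 1
```

The parentheses matter because & binds more tightly than <= in Python, so leaving them out changes how the expression is grouped.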

@ptrblck Thanks for your prompt reply. After fixing this problem, I ran CUDA_LAUNCH_BLOCKING=1 python train.py and got the following error:

/opt/conda/conda-bld/pytorch_1595629395347/work/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [342,0,0] Assertion t >= 0 && t < n_classes failed.
/opt/conda/conda-bld/pytorch_1595629395347/work/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [343,0,0] Assertion t >= 0 && t < n_classes failed.
/opt/conda/conda-bld/pytorch_1595629395347/work/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [344,0,0] Assertion t >= 0 && t < n_classes failed.
/opt/conda/conda-bld/pytorch_1595629395347/work/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [345,0,0] Assertion t >= 0 && t < n_classes failed.
/opt/conda/conda-bld/pytorch_1595629395347/work/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [346,0,0] Assertion t >= 0 && t < n_classes failed.
/opt/conda/conda-bld/pytorch_1595629395347/work/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [347,0,0] Assertion t >= 0 && t < n_classes failed.
Traceback (most recent call last):
  File "train.py", line 238, in <module>
    main()
  File "train.py", line 126, in main
    train(net, optimizer)
  File "train.py", line 197, in train
    loss1 = criterion_CE(out, torch.squeeze(labels).long())
  File "/home/public/software/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/public/software/anaconda3/lib/python3.8/site-packages/torch/nn/modules/loss.py", line 947, in forward
    return F.cross_entropy(input, target, weight=self.weight,
  File "/home/public/software/anaconda3/lib/python3.8/site-packages/torch/nn/functional.py", line 2422, in cross_entropy
    return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
  File "/home/public/software/anaconda3/lib/python3.8/site-packages/torch/nn/functional.py", line 2220, in nll_loss
    ret = torch._C._nn.nll_loss2d(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
RuntimeError: cuda runtime error (710) : device-side assert triggered at /opt/conda/conda-bld/pytorch_1595629395347/work/aten/src/THCUNN/generic/SpatialClassNLLCriterion.cu:134

It seems that I should use nn.BCELoss rather than nn.CrossEntropyLoss here? As far as I can tell, only nn.CrossEntropyLoss supports ignore_index for classification, while nn.BCELoss doesn’t.

Here’s my loss,

        criterion_CE = nn.CrossEntropyLoss(ignore_index=-1).cuda()
        loss = criterion_CE(out, torch.squeeze(labels).long())
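The assertion t >= 0 && t < n_classes in the log above says that some target value lies outside the valid class range. A minimal sketch for checking this before the loss call is shown below. The helper name check_targets and the tensor shapes are hypothetical. One possible cause, which the thread does not confirm, is a model output with a single channel, in which case only class 0 is a valid target.

```python
import torch
import torch.nn as nn

def check_targets(out, target, ignore_index=-1):
    """Raise a readable error if a target is neither ignore_index nor in [0, n_classes)."""
    n_classes = out.shape[1]  # logits are expected as (N, C) or (N, C, ...)
    valid = (target == ignore_index) | ((target >= 0) & (target < n_classes))
    if not valid.all():
        bad = torch.unique(target[~valid]).tolist()
        raise ValueError(f"invalid targets {bad} for n_classes={n_classes}")

# Hypothetical example: 4 samples, 2 classes, one label marked as ignored.
labels = torch.tensor([0, 1, -1, 1])
out = torch.randn(4, 2)
check_targets(out, labels)  # passes: every label is 0, 1, or ignore_index
loss = nn.CrossEntropyLoss(ignore_index=-1)(out, labels)

# With a single output channel, label 1 is out of range.
single_channel_out = torch.randn(4, 1)
try:
    check_targets(single_channel_out, labels)
    rejected = False
except ValueError as err:
    rejected = True
    print(err)
```

Running the check on CPU tensors gives a plain Python exception instead of a device-side assert, which is usually easier to read.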

Answered in your cross-post.
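The cross-post is not reproduced here, so the following is not necessarily what was suggested there. As a hedged sketch, one common way to emulate ignore_index with a binary loss is to compute the per-element loss and average only over the elements that are not marked as ignored. The function name masked_bce_with_logits and the example values are hypothetical.

```python
import torch
import torch.nn.functional as F

def masked_bce_with_logits(logits, target, ignore_value=-1):
    """Mean BCE-with-logits over the elements whose target differs from ignore_value."""
    keep = target != ignore_value
    # Replace ignored targets with 0 so the loss is defined everywhere;
    # those positions are zeroed out by the mask before averaging.
    safe_target = torch.where(keep, target, torch.zeros_like(target))
    per_element = F.binary_cross_entropy_with_logits(
        logits, safe_target, reduction="none"
    )
    kept = keep.to(per_element.dtype)
    # clamp avoids a division by zero when every element is ignored
    return (per_element * kept).sum() / kept.sum().clamp(min=1)

# Hypothetical example: the last element is marked as ignored.
logits = torch.tensor([2.0, -1.0, 0.5])
target = torch.tensor([1.0, 0.0, -1.0])
loss = masked_bce_with_logits(logits, target)

# Reference value computed on the two kept elements only.
reference = F.binary_cross_entropy_with_logits(logits[:2], target[:2])
print(loss.item(), reference.item())
```

This assumes the model emits one logit per element and that the target is a float tensor of the same shape.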