Change float labels to integer labels for binary segmentation

Dear engineers,

I am sorry if my question seems silly, but I would like to learn; I am quite new to programming and to segmentation tasks. My dataset has labels that are not in the form [0, 1], as they should be for a binary segmentation task. The readme of the data says that the label values are probabilities ranging from 0 to 1, and that I should apply a threshold of 0.5 to adjust the data before feeding it into the network. I would like to get a better understanding of this concept, as well as how I can modify the labels before feeding the data into the network. When I input the data and labels into the network without thresholding the labels, the network cannot detect anything.

Any suggestions and/or comments would be highly appreciated

Best regards,

Could you explain a bit how you are feeding the target to the network?
Usually you would feed the data into the model and use its output and target to calculate the loss.
You don’t need to apply a threshold on the target to calculate the loss.
However you might want to apply a threshold on the model outputs to get the class predictions (for a binary use case).
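To illustrate, here is a minimal sketch with made-up values; the tensor shape and numbers are just for demonstration:

```python
import torch

# Hypothetical model output for a binary segmentation task:
# probabilities in [0, 1] (e.g. after a sigmoid), shape [batch, 1, H, W].
probs = torch.tensor([[[[0.1, 0.7],
                        [0.4, 0.9]]]])

# The loss is computed on the raw outputs (no threshold needed),
# but class predictions come from thresholding at 0.5:
preds = (probs > 0.5).long()
print(preds.tolist())  # [[[[0, 1], [0, 1]]]]
```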

I am grateful for your reply sir.

Yes, you are right. I am not applying a threshold on the target to calculate the loss.

I am feeding the target directly without considering any threshold.

Am I supposed to apply a threshold to the target before feeding it into the network? I have noticed that the loss decreases normally without anything strange. However, the testing results were very bad (dice = 0.002).

Please, I would like to know when and how the threshold should be applied.

Thank you for your time and patience

Usually not, but I might misunderstand your use case.
Do you want to apply a threshold to the target or your model outputs?
The latter case would make sense, while I’m not sure about the former one.

From your explanation, I think that the threshold should be applied to the model outputs instead.

Please, how should it be done?

In the meantime, I tried applying a threshold to the target before feeding it into the network (this could be seen as a data preprocessing step). I did it with the following lines of code:

label = np.asarray(label)
label = label.reshape(128, 128, 128)
label = np.where(label > 0.5, 1, 0)

Please, I would like to know which one is logical and more appropriate, and the reasons why.

Really appreciate it.

If your target is a probability between 0 and 1, you could directly use it in nn.BCEWithLogitsLoss.
You don’t need to apply a threshold on the model output to calculate the loss, but could use it to get the predictions and calculate the accuracy.
If your model outputs are logits, you could use a threshold of 0.0.
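For example (the logit values here are made up), thresholding logits at 0.0 is equivalent to thresholding sigmoid probabilities at 0.5, since sigmoid(0) == 0.5:

```python
import torch

logits = torch.tensor([-2.0, -0.1, 0.3, 1.5])  # hypothetical raw model outputs

# Thresholding the logits at 0.0 gives the same predictions
# as thresholding the sigmoid probabilities at 0.5:
preds_from_logits = logits > 0.0
preds_from_probs = torch.sigmoid(logits) > 0.5
print(torch.equal(preds_from_logits, preds_from_probs))  # True
```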

Thank you very much for the suggestion. I am using a customized loss function consisting of the combination of cross_entropy loss and dice loss.

In this case, how should the threshold be applied? Please, could you provide a sample code?

Could you post the custom loss and explain what your model outputs?
Since you are using nn.CrossEntropyLoss, I assume your model returns an output in the shape [batch_size, 2, height, width]? If that’s the case, you should not apply any threshold, but get the predicted classes via preds = torch.argmax(output, 1).

If your model returns logits of the shape [batch_size, 1, height, width], you could get the predictions via: preds = output > 0.0. However, see the comment before, as I’m skeptical what your model returns.
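As a small sketch contrasting the two conventions (the shapes here are illustrative):

```python
import torch

batch_size, h, w = 2, 4, 4

# Case 1: two-channel output trained with nn.CrossEntropyLoss.
out_2ch = torch.randn(batch_size, 2, h, w)
preds_2ch = torch.argmax(out_2ch, dim=1)    # shape [batch, h, w]

# Case 2: single-channel logit output trained with nn.BCEWithLogitsLoss.
out_1ch = torch.randn(batch_size, 1, h, w)
preds_1ch = (out_1ch > 0.0).long()          # shape [batch, 1, h, w]

print(preds_2ch.shape, preds_1ch.shape)
```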

My loss function is as follows.

outputs = net(volume_batch)

loss_seg = F.cross_entropy(outputs, label_batch.long())
outputs_soft = F.softmax(outputs, dim=1)
loss_seg_dice = dice_loss(outputs_soft[:, 1, :, :, :], label_batch == 1)
loss = loss_seg + loss_seg_dice

The shape of my model output is [4, 1, 128, 128, 128]. It is a 3D model and I have set 4 as the batch size.

The output shape won’t work for a binary classification using F.cross_entropy and you should get an error, if label_batch contains zeros and ones (for both classes).

nn.CrossEntropyLoss expects an output in the shape [batch_size, nb_classes=2, depth, height, width] in your use case, and a target in the shape [batch_size, depth, height, width] containing values in the range [0, 1].
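A minimal example of these expected shapes with random tensors (the spatial sizes are arbitrary):

```python
import torch
import torch.nn.functional as F

batch_size, nb_classes, d, h, w = 4, 2, 8, 8, 8

# Model output: [batch_size, nb_classes, depth, height, width]
output = torch.randn(batch_size, nb_classes, d, h, w)
# Target: [batch_size, depth, height, width] containing class indices 0 or 1
target = torch.randint(0, nb_classes, (batch_size, d, h, w))

loss = F.cross_entropy(output, target)
print(loss.item())
```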

Also, based on your given shape, outputs_soft[:,1,:,:,:] should give you an IndexError.

Could you double check the shapes, please?

I am sorry, there was a mistake with the output shape. My output shape is [4, 2, 128, 128, 128]. The target shape is [4, 1, 128, 128, 128].

The outputs_soft is obtained as follows.

outputs_soft = F.softmax(outputs, dim=1)

And computing the dice_loss

loss_seg_dice = dice_loss(outputs_soft[:, 1, :, :, :], label_batch == 1)

does not give any error.

Thanks for the update.

This should raise an error in F.cross_entropy, as the target contains a channel dimension, which is not expected.
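Assuming your target really has the shape [4, 1, 128, 128, 128], you could squeeze the channel dimension before computing the loss. A small sketch with reduced spatial sizes:

```python
import torch
import torch.nn.functional as F

output = torch.randn(4, 2, 8, 8, 8)                  # [batch, nb_classes, D, H, W]
label_batch = torch.randint(0, 2, (4, 1, 8, 8, 8)).float()

# F.cross_entropy expects the target without the channel dimension,
# so squeeze it out and convert to class indices first:
target = label_batch.squeeze(1).long()               # [batch, D, H, W]
loss = F.cross_entropy(output, target)
print(loss.shape)  # torch.Size([]) -- a scalar loss
```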

Dear sir,
My sincere apologies. I have cross-checked everything and you are right. I have been chasing this bug for almost two days now.

target size = [4,128,128,128]
outputs size = [4,2,128,128,128]
With this loss function, how should I apply the threshold to the model output?

outputs = net(volume_batch)

loss_seg = F.cross_entropy(outputs, label_batch.long())
outputs_soft = F.softmax(outputs, dim=1)
loss_seg_dice = dice_loss(outputs_soft[:, 1, :, :, :], label_batch == 1)
loss = loss_seg + loss_seg_dice

My sincere gratitude for your time, patience and kindness

You don’t need to apply a threshold for these shapes and use case.
The dice_loss should probably get the predicted class labels via:

preds = torch.argmax(outputs, 1)

Could you remove the softmax and pass preds to dice_loss?

Thank you sir! I will do so and see how it goes

I have removed the softmax and have modified the loss as you suggested.

outputs = net(volume_batch)
loss_seg = F.cross_entropy(outputs, label_batch.long())
outputs_soft = torch.argmax(outputs, 1)
loss_seg_dice = dice_loss(outputs_soft, label_batch == 1)
loss = loss_seg + loss_seg_dice

However, the computed dice score = 0.000 and it does not change after training for more than 300 epochs. Please, what could be the problem?

Your dice_loss implementation might be wrong or the inputs are not what I would expect.
Could you post the code for the dice_loss implementation, please?

Hello sir,
The dice_loss implementation is as follows:

def dice_loss(score, target):
    target = target.float()
    smooth = 1e-5  # avoids division by zero for empty masks
    intersect = torch.sum(score * target)
    y_sum = torch.sum(target * target)
    z_sum = torch.sum(score * score)
    # soft dice coefficient, returned as 1 - dice so that lower is better
    loss = (2 * intersect + smooth) / (z_sum + y_sum + smooth)
    loss = 1 - loss
    return loss
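For reference, here is a quick sanity check of this function on toy tensors (the values are made up):

```python
import torch

def dice_loss(score, target):
    # same dice_loss as posted above
    target = target.float()
    smooth = 1e-5
    intersect = torch.sum(score * target)
    y_sum = torch.sum(target * target)
    z_sum = torch.sum(score * score)
    loss = (2 * intersect + smooth) / (z_sum + y_sum + smooth)
    return 1 - loss

target = torch.tensor([0., 1., 1., 0.])
print(dice_loss(target.clone(), target).item())  # ~0.0 for a perfect prediction
print(dice_loss(1.0 - target, target).item())    # ~1.0 for a completely wrong one
```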

As I explained earlier, the problem is with the targets. The targets are not in the form 0 (background) and 1 (foreground), as they should be for binary segmentation. Instead, the foreground values are floats ranging from 0.1 to 1. After reading online, I found that the task could also be seen as a regression problem rather than classification. I am yet to get a clear picture of everything, given that I am a beginner. Please, any explanation and suggestions would be highly appreciated.

Thanks for mentioning it again.
The dice loss is used for discrete data, so I doubt it can work for a regression task.

Since your labels are in the range [0, 1], I would again recommend to use nn.BCEWithLogitsLoss.
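A minimal sketch of this with random tensors (the shapes are illustrative): nn.BCEWithLogitsLoss accepts "soft" float targets directly, so no thresholding of the labels is required.

```python
import torch
import torch.nn as nn

criterion = nn.BCEWithLogitsLoss()

logits = torch.randn(4, 1, 8, 8, 8)      # raw (unbounded) model outputs
soft_target = torch.rand(4, 1, 8, 8, 8)  # probabilities in [0, 1]

# The "soft" float targets are used as-is; no thresholding needed.
loss = criterion(logits, soft_target)
print(loss.item())
```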

Alternatively, since you are already thresholding the targets to 0 and 1 to be able to use F.cross_entropy, you could apply the same to the dice loss calculation.

Yes sir, that is what I have applied so far and it seems to work:

label = np.asarray(label)
label = label.reshape(128, 128, 128)
label = np.where(label > 0.5, 1, 0)
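If it helps others, the same thresholding step can also be written directly in PyTorch (assuming the label is already a tensor):

```python
import torch

label = torch.rand(128, 128, 128)  # float "probabilities" in [0, 1]

# Equivalent of np.where(label > 0.5, 1, 0):
label = (label > 0.5).long()
print(label.unique())  # tensor([0, 1])
```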

Thank you very much for following up.