Should I use nn.BCEWithLogitsLoss() or cross-entropy loss for segmentation?

ayadav01 · May 3, 2020, 1:16am

I am trying to build a simple U-Net for segmentation. The input images and the labels both have shape (1 x width x height), and every pixel value in the label image is either 0 or 1. I am a beginner in deep learning and have just started with PyTorch, so I want to make sure I am using the right loss function for this task. Would it be better to use binary cross entropy or categorical cross entropy? Also, how can I pass weights to the loss function so that it gives more weight to the 1 pixels than to the 0 pixels? Thanks!

ptrblck · May 3, 2020, 3:29am

For a binary segmentation use case you could use nn.BCEWithLogitsLoss and use the pos_weight argument to weight the positive or negative class more (a value above 1 emphasizes the positive class, a value below 1 the negative class).
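
As a minimal sketch of that suggestion, the snippet below applies nn.BCEWithLogitsLoss with pos_weight to a single-channel batch. The tensor shapes, the random data and the weight of 9.0 are assumptions chosen only for illustration; the logits stand in for the raw output of a U-Net with no final sigmoid.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical batch: 4 single-channel images of 8x8 pixels.
logits = torch.randn(4, 1, 8, 8)                   # raw model outputs, no sigmoid applied
targets = (torch.rand(4, 1, 8, 8) > 0.9).float()   # sparse binary masks; targets must be float

# pos_weight scales the loss term of the positive (1) pixels.
# 9.0 is an arbitrary example value, not a recommendation.
criterion = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([9.0]))
loss = criterion(logits, targets)

# Unweighted loss on the same data, for comparison.
plain_loss = nn.BCEWithLogitsLoss()(logits, targets)
print(loss.item(), plain_loss.item())
```

The loss applies the sigmoid internally, so the model itself should output raw logits.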

ayadav01 · May 3, 2020, 8:26am

OK, so if I am using nn.BCEWithLogitsLoss, that means I only have a single class output. Does that mean I only have to pass a single value for the pos_weight argument? And is there a general formula for how much weight to give to each class?

ptrblck · May 4, 2020, 12:06am

Yes, you should pass a single value to pos_weight. From the docs:

For example, if a dataset contains 100 positive and 300 negative examples of a single class, then pos_weight for the class should be equal to 300/100=3 . The loss would act as if the dataset contains 3 * 100=300 positive examples.
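
The arithmetic in the docs example can be checked with a small sketch. It builds 100 positive and 300 negative targets, sets pos_weight to 300/100, and compares the library result with the weighted formula written out by hand. The random logits are placeholder data, not model outputs.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

n_pos, n_neg = 100, 300
pos_weight = torch.tensor([n_neg / n_pos])   # 300 / 100 = 3.0

logits = torch.randn(n_pos + n_neg)          # placeholder predictions
targets = torch.cat([torch.ones(n_pos), torch.zeros(n_neg)])

weighted = F.binary_cross_entropy_with_logits(logits, targets, pos_weight=pos_weight)

# The same loss by hand: only the positive term is multiplied by pos_weight.
# log(sigmoid(x)) is F.logsigmoid(x) and log(1 - sigmoid(x)) is F.logsigmoid(-x).
manual = -(pos_weight * targets * F.logsigmoid(logits)
           + (1 - targets) * F.logsigmoid(-logits)).mean()

print(weighted.item(), manual.item())
```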

Patrice · August 20, 2020, 12:02pm

Hello Mr. Ptrblck,
Please, how can I determine the value to pass to pos_weight? In my case, all the images contain both background and foreground (positive and negative). However, the number of foreground pixels (positive) is much smaller than that of background pixels (negative). Please, could anyone help me solve it?
Any suggestions or comments would be highly appreciated.

ptrblck · August 20, 2020, 8:14pm

You could count the number of positive and negative pixels for your complete training dataset and use the average of these counts to calculate the pos_weight (negative count divided by positive count, as in the docs example above). Alternatively, you could calculate the pos_weight for the current batch by counting its positive and negative pixels and use the functional API via F.binary_cross_entropy_with_logits.
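
Both options could be sketched roughly as follows. The helper name pos_weight_from_masks and the random masks are hypothetical. The ratio of total counts is the same as the ratio of per-image average counts, and it also equals (1 - p) / p, where p is the fraction of positive pixels.

```python
import torch
import torch.nn.functional as F

def pos_weight_from_masks(masks):
    """Return n_negative / n_positive for a tensor of 0/1 masks, as a 1-element tensor."""
    n_pos = masks.sum()
    n_neg = masks.numel() - n_pos
    # clamp avoids a division by zero when there are no positive pixels at all
    return (n_neg / n_pos.clamp(min=1)).reshape(1)

torch.manual_seed(0)

# Stand-in for all training masks: 20 masks with roughly 5% positive pixels.
train_masks = (torch.rand(20, 1, 16, 16) > 0.95).float()

# Option 1: one weight computed once from the whole training set.
dataset_pos_weight = pos_weight_from_masks(train_masks)

# Option 2: a weight recomputed for every batch, passed to the functional API.
batch_masks = train_masks[:4]
logits = torch.randn(4, 1, 16, 16)   # placeholder model output
batch_pos_weight = pos_weight_from_masks(batch_masks)
loss = F.binary_cross_entropy_with_logits(logits, batch_masks, pos_weight=batch_pos_weight)

print(dataset_pos_weight.item(), batch_pos_weight.item(), loss.item())
```

A per-batch weight adapts to each batch but changes the loss scale from step to step, while a dataset-wide weight keeps it fixed.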

Patrice · August 21, 2020, 4:17am

Please sir, how can I achieve this?
I have computed the mean() of a given image, which gave the fraction of 1s in the image. Then I summed these values to obtain the value for the whole training data and passed the result to pos_weight, but there was no improvement.

ptrblck · August 21, 2020, 7:39am

You could select a small subset of your dataset (e.g. just 10 samples) and validate your approach (e.g. if the right pos_weight was used or to further fine tune the pos_weight).
Once your model training behaves as desired, you could scale it up again.
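
A rough sketch of this small-subset check, under heavy assumptions: the dataset is synthetic, a single nn.Conv2d layer stands in for the U-Net, and the learning rate and epoch count are arbitrary. The point is only the pattern of taking 10 samples with torch.utils.data.Subset and confirming that the weighted loss goes down on them.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, Subset, TensorDataset

torch.manual_seed(0)

# Synthetic stand-in for a real dataset: a pixel is positive when its value is large.
images = torch.randn(100, 1, 16, 16)
masks = (images > 1.5).float()
full_dataset = TensorDataset(images, masks)

# Keep only the first 10 samples for the sanity check.
small_dataset = Subset(full_dataset, range(10))
loader = DataLoader(small_dataset, batch_size=5, shuffle=True)

# pos_weight = negatives / positives, counted on the subset only.
n_pos = masks[:10].sum()
pos_weight = ((masks[:10].numel() - n_pos) / n_pos.clamp(min=1)).reshape(1)

model = nn.Conv2d(1, 1, kernel_size=3, padding=1)   # placeholder for a U-Net
criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)
optimizer = torch.optim.Adam(model.parameters(), lr=0.05)

losses = []
for epoch in range(30):
    epoch_loss = 0.0
    for x, y in loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()
    losses.append(epoch_loss / len(loader))

print(f"first epoch loss {losses[0]:.4f}, last epoch loss {losses[-1]:.4f}")
```

If the loss does not drop even on such a tiny subset, the problem is more likely in the pipeline or the weight than in the amount of data.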

ayadav01 · August 21, 2020, 8:31am

Hi Patrice, for a segmentation model I would suggest using a soft Dice loss rather than a weighted nn.BCEWithLogitsLoss(). I don’t think you would see much improvement even with a weighted cross-entropy loss. You can implement a soft Dice loss like this:

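import torch
import torch.nn as nn


class SoftDiceLoss(nn.Module):
    """Soft Dice loss for binary segmentation; expects raw logits."""

    def __init__(self, smooth=1.0):
        super().__init__()
        self.smooth = smooth  # avoids division by zero on empty masks

    def forward(self, logits, targets):
        num = targets.size(0)
        # Assumes the model does NOT end with a sigmoid layer. If it does,
        # use `probs = logits` instead.
        probs = torch.sigmoid(logits)
        # Flatten each sample to a vector; reshape also handles non-contiguous tensors.
        m1 = probs.reshape(num, -1)
        m2 = targets.reshape(num, -1).float()
        intersection = (m1 * m2).sum(1)

        # Per-sample Dice score. smooth is added once to the numerator and once
        # to the denominator so the score stays in (0, 1].
        score = (2. * intersection + self.smooth) / (m1.sum(1) + m2.sum(1) + self.smooth)
        # The loss is 1 minus the mean Dice score over the batch.
        return 1 - score.sum() / num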

criterion = SoftDiceLoss().to(device) # to put on cuda or cpu
# In the training loop
output = model(images)
loss = criterion(output, gt_masks)
# Then do the backwards and optimizer step
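
A quick way to sanity-check a soft Dice formula is to feed it a perfect and a fully wrong prediction. The standalone function below is a hypothetical functional variant (the name soft_dice_loss is an assumption) that takes probabilities rather than logits and adds the smoothing term once to the numerator and once to the denominator.

```python
import torch

def soft_dice_loss(probs, targets, smooth=1.0):
    """1 - mean per-sample soft Dice; probs and targets are (N, ...) tensors in [0, 1]."""
    n = targets.size(0)
    p = probs.reshape(n, -1)
    t = targets.reshape(n, -1).float()
    intersection = (p * t).sum(dim=1)
    dice = (2.0 * intersection + smooth) / (p.sum(dim=1) + t.sum(dim=1) + smooth)
    return 1.0 - dice.mean()

# Two identical masks with a 4x4 foreground square on an 8x8 background.
mask = torch.zeros(2, 1, 8, 8)
mask[:, :, 2:6, 2:6] = 1.0

perfect = soft_dice_loss(mask, mask)          # prediction equals the mask
inverted = soft_dice_loss(1.0 - mask, mask)   # prediction is the exact opposite
print(perfect.item(), inverted.item())
```

A perfect prediction should give a loss close to 0 and an inverted one a loss close to 1; values outside [0, 1] would point to a mistake in the formula.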

Patrice · August 22, 2020, 1:35am

Hello Anil, thank you very much for your suggestion. I will give it a try after fixing my computer. I am facing some hardware problems presently.

Patrice · August 22, 2020, 1:36am

Ok sir, I will try that.

morisa66 · August 22, 2020, 3:20am

I recommend nn.BCEWithLogitsLoss as the loss function.