I am trying to build a simple U-Net for segmentation. The input image and the labels both have shape (1 x width x height). The pixel values in the label image are either 0 or 1. I am a beginner in deep learning and have just started with PyTorch, so I want to make sure I am using the right loss function for this task. Would it be better to use binary cross entropy or categorical cross entropy? Also, how can I pass weights to the loss function so that it gives more weight to the 1 pixels than to the 0 pixels? Thanks!
For a binary segmentation use case you could use nn.BCEWithLogitsLoss and use the pos_weight argument to weight the positive or negative class more.
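As a minimal sketch of that suggestion: the tensor shapes and the pos_weight value of 5.0 below are arbitrary placeholders, not values from this thread. The model is assumed to output raw logits (no final sigmoid), and the target mask has to be a float tensor.

```python
import torch
import torch.nn as nn

# Hypothetical batch: 4 single-channel 64x64 images.
logits = torch.randn(4, 1, 64, 64)                   # raw model outputs, no sigmoid applied
targets = (torch.rand(4, 1, 64, 64) > 0.9).float()   # 0/1 mask as a float tensor

# pos_weight > 1 increases the loss contribution of pixels labelled 1.
# The value 5.0 is only a placeholder for illustration.
criterion = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([5.0]))
loss = criterion(logits, targets)  # scalar, averaged over all pixels
```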
OK, so if I am using nn.BCEWithLogitsLoss, that means I only have one class output. Does that mean I only have to pass one value for the pos_weight argument? And is there a general formula for how much weight to give to each class?
Yes, you should pass a single value to pos_weight. From the docs:
For example, if a dataset contains 100 positive and 300 negative examples of a single class, then pos_weight for the class should be equal to 300/100=3. The loss would act as if the dataset contains 3 * 100=300 positive examples.
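The effect described in the docs can be checked on a toy example. This is only an illustrative sketch with made-up logits and targets: with pos_weight = 3, the per-element loss of the positive targets is scaled by 3 and the loss of the negative targets is left unchanged.

```python
import torch
import torch.nn.functional as F

num_pos, num_neg = 100, 300
pos_weight = torch.tensor(num_neg / num_pos)  # 300 / 100 = 3.0

# Made-up logits and targets, just to compare the two losses element-wise.
logits = torch.randn(8)
targets = torch.tensor([1., 0., 0., 1., 0., 0., 0., 1.])

weighted = F.binary_cross_entropy_with_logits(
    logits, targets, pos_weight=pos_weight, reduction="none")
plain = F.binary_cross_entropy_with_logits(
    logits, targets, reduction="none")

# Positive elements are scaled by pos_weight, negative elements are unchanged.
expected = torch.where(targets == 1, plain * pos_weight, plain)
```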
Hello Mr. Ptrblck,
Please, how can I determine the value to pass to pos_weight? In my case, all the images contain both background and foreground (positive and negative). However, the number of foreground pixels (positive) is much smaller than that of background pixels (negative). Please, could anyone help me solve it?
Any suggestions or comments would be highly appreciated.
You could count the number of positive and negative pixels for your complete training dataset and use the average of these counts to calculate the pos_weight, or alternatively you could calculate the pos_weight for the current batch by counting the positive and negative pixels and use the functional API via
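A rough sketch of the per-batch variant, assuming the functional call meant here is F.binary_cross_entropy_with_logits (that is my assumption, the post does not name it). The helper pos_weight_from_masks is a hypothetical name. Note that the ratio is negative pixels over positive pixels, not the fraction of positive pixels.

```python
import torch
import torch.nn.functional as F

def pos_weight_from_masks(masks):
    """Return (#negative pixels / #positive pixels) for a float 0/1 mask tensor."""
    num_pos = masks.sum()
    num_neg = masks.numel() - num_pos
    # clamp avoids a division by zero if the masks contain no positive pixel
    return num_neg / num_pos.clamp(min=1.0)

# Hypothetical batch: raw logits from the model and the matching 0/1 masks.
logits = torch.randn(4, 1, 32, 32)
masks = (torch.rand(4, 1, 32, 32) > 0.9).float()

batch_pos_weight = pos_weight_from_masks(masks)
loss = F.binary_cross_entropy_with_logits(logits, masks, pos_weight=batch_pos_weight)
```

The same helper could be applied once to all training masks to get a single dataset-wide value for nn.BCEWithLogitsLoss.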
Please sir, how can I achieve this?
I computed the mean() of each image, which gives the fraction of 1s in that image. Then I summed these values over the whole training data and passed the result to pos_weight, but there was no improvement.
You could select a small subset of your dataset (e.g. just 10 samples) and validate your approach (e.g. if the right
pos_weight was used or to further fine tune the
Once your model training behaves as desired, you could scale it up again.
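One possible way to set up such a small-subset check is sketched below. The random data, the tiny convolutional model, and all hyperparameters are stand-ins I made up so the snippet runs on its own; in practice you would plug in your own dataset and U-Net.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, Subset, TensorDataset

torch.manual_seed(0)

# Stand-in data: random images with a sparse, learnable 0/1 mask.
images = torch.randn(100, 1, 32, 32)
masks = (images > 1.0).float()
full_dataset = TensorDataset(images, masks)

# Keep only 10 samples for the sanity check.
small_dataset = Subset(full_dataset, range(10))
loader = DataLoader(small_dataset, batch_size=5, shuffle=True)

# Stand-in model (not a U-Net): outputs one logit per pixel.
model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 1, kernel_size=1),
)

# pos_weight = #negative pixels / #positive pixels on the subset.
num_pos = masks[:10].sum()
pos_weight = (masks[:10].numel() - num_pos) / num_pos.clamp(min=1.0)
criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

losses = []  # mean training loss per epoch
for epoch in range(30):
    epoch_loss = 0.0
    for x, y in loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()
    losses.append(epoch_loss / len(loader))
```

If the loss on these few samples does not go down, the problem is more likely in the data pipeline, the loss setup, or the model than in the amount of data.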
Hi Patrice, for a segmentation model I would suggest using a soft dice loss rather than a weighted nn.BCEWithLogitsLoss(). I don’t think you would see much improvement even with a weighted cross entropy loss. You can implement soft dice loss like this:
    import torch
    import torch.nn as nn


    class SoftDiceLoss(nn.Module):
        def __init__(self, weight=None, size_average=True):
            # weight and size_average are accepted but not used
            super().__init__()

        def forward(self, logits, targets):
            smooth = 1.0
            num = targets.size(0)
            # I am assuming the model does not have a sigmoid layer at the end.
            # If it does, change torch.sigmoid(logits) to simply logits.
            probs = torch.sigmoid(logits)
            # Flatten each sample to a vector
            m1 = probs.reshape(num, -1)
            m2 = targets.reshape(num, -1)
            intersection = m1 * m2
            # Per-sample Dice score; smooth is added once to the numerator and
            # once to the denominator so the score stays in (0, 1]
            score = (2.0 * intersection.sum(1) + smooth) / (m1.sum(1) + m2.sum(1) + smooth)
            # Average over the batch and turn the score into a loss
            return 1 - score.sum() / num


    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    criterion = SoftDiceLoss().to(device)  # to put on cuda or cpu

    # In the training loop
    output = model(images)
    loss = criterion(output, gt_masks)
    # Then do the backward pass and the optimizer step
Hello Anil, thank you very much for your suggestion. I will give it a try after fixing my computer. I am facing some hardware problems presently.
Ok sir, I will try that.
I recommend nn.BCEWithLogitsLoss as the loss function.