Why is the minimum loss not zero for either MultiLabelSoftMarginLoss or BCEWithLogitsLoss?

I am trying to understand how to use MultiLabelSoftMarginLoss and BCEWithLogitsLoss.
My question has two parts. For simplicity, I consider the case of multi-label classification for images.

  1. What should the format of the targets be? Should they be 0 and 1, where 0 means that the input does not contain that category and 1 means that it does?

  2. Why is the minimum of these two losses not zero when I give them an input equal to the targets? Shouldn’t that be the case? Is it a bug, or am I missing something?

Here is an example with 6 categories in total, where the input contains objects from categories 1, 3, and 5. The target for this image is defined as follows (please let me know if it is defined incorrectly).

import torch
import torch.nn as nn

BCEWithLogitsLoss = nn.BCEWithLogitsLoss()
MultiLabelSoftMarginLoss = nn.MultiLabelSoftMarginLoss()

# Categories 1, 3 and 5 (indices 0, 2 and 4) are present in the image
target = torch.tensor([[1, 0, 1, 0, 1, 0]], dtype=torch.float32)
# Build an output tensor with the same values as the target
output = torch.zeros([1, 6])
output[0, 0] = 1
output[0, 2] = 1
output[0, 4] = 1
output.requires_grad = True
loss_1 = BCEWithLogitsLoss(output, target)  # loss_1 will be equal to 0.5032
loss_2 = MultiLabelSoftMarginLoss(output, target)  # loss_2 will be equal to 0.5032

Shouldn’t the losses be zero? What am I missing here? Why are they even equal? What should the minimum value be, if not zero?

@ptrblck and @albanD, I would appreciate your thoughts on this :)

Hi Seyeeet!

BCEWithLogitsLoss takes raw-score logits (that run from -inf to
inf) as its input, but takes probabilities (0.0 to 1.0) as its target.

Instead of using 0.0 and 1.0 for your input, use very large negative
and positive values. Then you will get a loss of (something close to)
0.0.
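To make the arithmetic concrete, here is a minimal pure-Python sketch of the per-element formula, assuming the default mean reduction. The helper `bce_with_logits` and the logit magnitude of 20 are illustrative choices, not part of PyTorch or of the original question.

```python
import math

def bce_with_logits(logits, targets):
    """Mean binary cross-entropy computed from raw logits (illustrative helper)."""
    total = 0.0
    for x, y in zip(logits, targets):
        # Numerically stable form of -[y*log(sigmoid(x)) + (1-y)*log(1-sigmoid(x))]
        total += max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))
    return total / len(logits)

targets = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]

# Logits of 1.0 and 0.0 are read as probabilities of about 0.73 and 0.5,
# so the prediction is far from a confident match with the targets.
loss_small = bce_with_logits([1.0, 0.0, 1.0, 0.0, 1.0, 0.0], targets)

# Large-magnitude logits map to probabilities very close to 1.0 and 0.0.
loss_large = bce_with_logits([20.0, -20.0, 20.0, -20.0, 20.0, -20.0], targets)

print(f"{loss_small:.4f}")  # 0.5032
print(loss_large)           # a very small positive number
```

The loss gets closer to zero as the logits grow in magnitude, but it never reaches zero for finite logits.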

(In general, you shouldn’t use BCELoss, but if you apply it to your
example values, you will get 0.0 for the loss.)
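As a rough check of this remark, the sketch below feeds the original 0.0 and 1.0 values to `nn.BCELoss`, which reads its input as probabilities rather than logits. It also compares `nn.BCELoss` applied after a sigmoid with `nn.BCEWithLogitsLoss` on the same logits. The variable names are illustrative.

```python
import torch
import torch.nn as nn

target = torch.tensor([[1, 0, 1, 0, 1, 0]], dtype=torch.float32)

# BCELoss reads its input as probabilities, so an input equal to the
# target is a perfect prediction.
probs = target.clone()
loss_bce = nn.BCELoss()(probs, target)
print(loss_bce.item())

# On logits, applying a sigmoid and then BCELoss matches BCEWithLogitsLoss.
logits = target.clone()
loss_via_sigmoid = nn.BCELoss()(torch.sigmoid(logits), target)
loss_with_logits = nn.BCEWithLogitsLoss()(logits, target)
print(loss_via_sigmoid.item(), loss_with_logits.item())
```

The second comparison shows that the two losses differ only in whether the sigmoid is applied inside the loss, which `BCEWithLogitsLoss` does in a numerically more stable way.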

BCEWithLogitsLoss and MultiLabelSoftMarginLoss are essentially
the same. See, for example:
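A minimal sketch of the equivalence, assuming the default `reduction='mean'` and no class weights for both losses. The batch size, seed, and random inputs are arbitrary choices made for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Arbitrary batch of 4 samples with 6 categories each
logits = torch.randn(4, 6)
target = torch.randint(0, 2, (4, 6)).float()

loss_bce = nn.BCEWithLogitsLoss()(logits, target)
loss_mlsm = nn.MultiLabelSoftMarginLoss()(logits, target)

print(loss_bce.item(), loss_mlsm.item())
print(torch.allclose(loss_bce, loss_mlsm))
```

With these defaults, both losses average the same per-element binary cross-entropy over categories and samples, which is why the two values in the question agree.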

Best.

K. Frank
