BCELoss for MultiClass problem


(Abhijeet Parida) #1

Is it possible to calculate the multiclass cross-entropy loss by successively applying the nn.BCELoss() implementation?

This is what I have tried:

# Implementation of the multiclass cross-entropy classification loss
import torch
import torch.nn as nn
from torch.autograd import Variable

def SegLossFn(predictions, targets):
    # predictions: [N, C, H, W]
    _, c, _, _ = predictions.size()
    loss = 0
    m = nn.Sigmoid()
    loss_fn = nn.BCELoss()
    # BCE -> MCE by summing the per-class BCE losses
    for i in range(c):
        loss += loss_fn(m(predictions[0][i]), Variable(targets[i][0]).cuda())
    return loss


#2

For multi-class classification you would usually just use nn.CrossEntropyLoss, and I don’t think you’ll end up with the same result, as you are calling torch.sigmoid on each prediction.

For multi-label classification, you might use nn.BCELoss with one-hot encoded targets and won’t need a for loop.
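Something like this works without the loop (a minimal sketch, shapes made up just for illustration):

import torch
import torch.nn as nn

probs = torch.sigmoid(torch.randn(4, 3))       # probabilities for 4 samples, 3 labels
targets = torch.randint(0, 2, (4, 3)).float()  # multi-hot target matrix
loss = nn.BCELoss()(probs, targets)            # all labels handled at once, no for loop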

Could you explain your use case a bit, as I’m currently not sure I understand it properly?


(Abhijeet Parida) #3

My question is about the addition operation and how it is handled. What happens to the grad_fn? How do the losses get added?


#4

The losses will get accumulated and your loss tensor will get grad_fn=<AddBackward0> as its grad_fn.
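You can verify this with a small snippet (toy tensors, just to illustrate):

import torch

a = torch.randn(1, requires_grad=True)
b = torch.randn(1, requires_grad=True)
loss = a.sum() + b.sum()
print(loss.grad_fn)  # <AddBackward0 object at 0x...>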


(Abhijeet Parida) #5

The reason for not using nn.CrossEntropyLoss was that my predicted output is of size [N,C,H,W] and the target is of size [N,C,H,W]. The [H,W] map in channel C of the target is 0 or 1 based on the presence or absence of class C. Is there a workaround?


#6

nn.CrossEntropyLoss expects a torch.LongTensor containing the class indices without the channel dimension. In your case, you could simply use:

targets = torch.argmax(targets, 1)

to create your target tensor.
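For example, a rough sketch with made-up shapes:

import torch
import torch.nn as nn

N, C, H, W = 2, 3, 4, 4
predictions = torch.randn(N, C, H, W)  # raw logits from the model
one_hot = torch.zeros(N, C, H, W)
one_hot[:, 0] = 1.                     # pretend every pixel belongs to class 0

targets = torch.argmax(one_hot, 1)     # [N, H, W], torch.LongTensor
loss = nn.CrossEntropyLoss()(predictions, targets)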


(Nam Vo) #7

I’m not sure there is such a thing as “multiclass cross-entropy”.
But doing binary classification for each class makes sense, so I think what you have tried is correct.


(Sebastian Raschka) #8

It’s usually called multi-category cross entropy, but yeah, CrossEntropyLoss is essentially that. Just be careful: CrossEntropyLoss takes the logits as inputs (before softmax), while BCELoss takes the probabilities as input (after the logistic sigmoid).
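A quick sketch of the difference (random tensors just for illustration):

import torch
import torch.nn as nn

logits = torch.randn(4, 3)                       # raw model outputs
class_idx = torch.tensor([0, 2, 1, 2])
ce = nn.CrossEntropyLoss()(logits, class_idx)    # softmax is applied internally

probs = torch.sigmoid(logits)                    # BCELoss needs probabilities first
multi_hot = torch.randint(0, 2, (4, 3)).float()
bce = nn.BCELoss()(probs, multi_hot)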


(Nam Vo) #9

Thanks, I think I confused multi-class with multi-label, where multiple BCE losses are applied like that.


(Sebastian Raschka) #10

I don’t think you confused anything, because both multi-label cross entropy and binary cross entropy can be used for multi-class problems. The difference, though, is that with binary cross entropy you mathematically assume that the classes are independent.
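One way to see the independence assumption (toy values just for illustration): per-class sigmoid probabilities don’t have to sum to 1, while softmax probabilities do.

import torch

logits = torch.tensor([2.0, 1.0, 0.5])
print(torch.sigmoid(logits))     # independent per-class probabilities
print(torch.softmax(logits, 0))  # mutually exclusive classes, sums to 1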


(Abhijeet Parida) #11

I think this particular implementation has some problem, because when I do loss.backward(), improvement is seen only in the last class. The previous channels just contain the complement of the last class.

For example:
Expected Output

[[1, 0],             [[0, 1],           [[0, 0],
 [0, 0]]              [0, 0]]            [1, 1]] 

Got Output

[[1, 1],             [[1, 1],           [[0, 0],
 [0, 0]]              [0, 0]]            [1, 1]]