One good way to think about a multi-label classification problem
is to understand it as a set of binary classification problems (in
your case, five binary classification problems) that all use the same
input run through the same network. That is, for a given input, each
of your five labels is either absent or present, so your network is
making five yes-or-no (binary) predictions.
Understanding your problem as a set of binary classifications, we
see that the appropriate loss function is BCEWithLogitsLoss (or,
less optimally, BCELoss). These loss functions have support for
multi-label problems built in.
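As a minimal sketch (the batch size, logit values, and labels here are made up for illustration, not taken from your code), a five-label setup with BCEWithLogitsLoss looks like:

```python
import torch
import torch.nn as nn

# hypothetical batch of 2 samples, each with 5 binary labels
logits = torch.tensor([[1.2, -0.7, 0.3, -2.1, 0.9],
                       [-0.4, 1.5, -1.0, 0.8, -0.2]])  # raw scores from the last linear layer
targets = torch.tensor([[1., 0., 1., 0., 1.],
                        [0., 1., 0., 1., 0.]])          # 0-or-1 presence of each label

# by default the loss is averaged over all 10 label slots
loss_fn = nn.BCEWithLogitsLoss()
loss = loss_fn(logits, targets)
print(loss.item())
```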
In this case I would say that three of your five predictions (one
“1” and two “0”s) were correct, and two (two “0”s that should have
been “1”s) were wrong.
This is a partial success (three out of five) but not perfect. So your
loss should not be 0 (assuming that 0 is the minimum value of your
loss function).
BCEWithLogitsLoss will, indeed, partially penalize this kind of
three-out-of-five prediction.
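To see this concretely (with made-up, confident logits), compare a fully correct prediction with a three-out-of-five one:

```python
import torch
import torch.nn as nn

loss_fn = nn.BCEWithLogitsLoss()
targets = torch.tensor([[1., 0., 0., 1., 1.]])

# confident, fully correct logits -> loss near zero
good = torch.tensor([[5., -5., -5., 5., 5.]])
# two of the "1" labels confidently predicted as "0" -> larger, but finite, loss
partial = torch.tensor([[5., -5., -5., -5., -5.]])

print(loss_fn(good, targets).item())     # small
print(loss_fn(partial, targets).item())  # larger: a partial penalty, not zero
```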
(Just to be clear, loss and accuracy are different things.)
I would count each correct prediction towards your overall
accuracy. If it turned out that all of your samples produced
three-out-of-five-correct predictions (an unlikely artificial
assumption), then your accuracy would be 60%.
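A simple way to count per-label correctness (again with illustrative numbers) is to threshold the logits at 0, which is the same as thresholding sigmoid(logit) at 0.5:

```python
import torch

logits = torch.tensor([[2.0, -1.0, -3.0, -0.5, -2.0]])  # five raw scores for one sample
targets = torch.tensor([[1., 0., 0., 1., 1.]])

# logit > 0 means predicted probability > 0.5, i.e. "label present"
preds = (logits > 0).float()
accuracy = (preds == targets).float().mean()
print(accuracy.item())  # 0.6 here: three of the five slots are correct
```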
Just a quick note about the specifics of BCEWithLogitsLoss:
Your labels (targets) are appropriately 0-or-1 binary class
labels. (These can be understood as probabilities – 0% chance
of the label being present vs. 100% chance. Furthermore, BCEWithLogitsLoss will actually accept probabilities between
0 and 1 for its targets – but you don’t have to use it this way.)
Your predictions, however, should not be 0-or-1 class labels
(nor probabilities), but rather logits – that is “raw scores” that
range from -inf to +inf. You would normally get these from
the last linear layer of your network with (in your case) five
outputs.
(You could use class labels / probabilities with BCELoss, but
using logits with BCEWithLogitsLoss is better.)
Thank you for your answer, but I believe you didn’t understand me.
My specific problem is a bit different from a classic multi-label problem:
I want to minimize my loss when the prediction is correct in only one class (or more),
and this is what I’m doing. I have a custom loss for my problem: