How to calculate accuracy for multi-label classification?

I have 100 classes and I am using BCEWithLogitsLoss. How do I calculate the accuracy?
Labels : torch.tensor([0,1,0,1,0.......,1])

What do you mean you have 100 classes?

You probably meant that you have 2 classes (or one, depending on how you look at it): 0 and 1.

One way to calculate accuracy would be to round your outputs. This would make 0.5 the classification threshold.

correct = 0.
total = 0.
with torch.no_grad():
    # get testing data from the data loader
    for data in test_loader:
        # get images and labels
        images, labels = data
        # move data to the gpu
        images = images.to(device)
        # send data through the network and save the outputs
        outputs = net(images)
        # map outputs to the range 0-1
        outputs = torch.sigmoid(outputs).cpu()        # <--- since you use BCEWithLogitsLoss
        # round down/up to either 0 or 1 (keep in mind that round() rounds half to even, so exactly 0.5 becomes 0)
        predicted = torch.round(outputs)
        # count every individual label, not just every sample,
        # since each sample carries one prediction per class
        total += labels.numel()
        # count how many labels were correctly predicted
        correct += (predicted == labels).sum().item()
accuracy = 100 * correct / total
print("Accuracy: {}%".format(accuracy))

Since you are using BCEWithLogitsLoss and not BCELoss, I am assuming you do not have a sigmoid layer in your net. This is why I put a sigmoid call in there.


For multi-label classification you can use scikit-learn's accuracy_score function.

For every observation I have 4-5 categories, and the total number of categories is 100.
For one observation the target labels are [1,3,56,71]; I have converted them into a one-hot vector representation.

For accuracy_score I need to round the output values to 1 and 0. How do I apply the threshold?

You can use the np.round() function.

import numpy as np
from sklearn.metrics import accuracy_score, f1_score

def accuracy(gt_S, pred_S):
    gt_S = np.asarray(gt_S)
    pred_S = np.round(pred_S)     # note: np.round() rounds half to even, so exactly 0.5 becomes 0
    acc = accuracy_score(gt_S, pred_S)
    f1m = f1_score(gt_S, pred_S, average='macro', zero_division=1)
    f1mi = f1_score(gt_S, pred_S, average='micro', zero_division=1)
    print('F1 macro score: {}'.format(f1m))
    print('F1 micro score: {}'.format(f1mi))
    print('Accuracy: {}'.format(acc))

The np.round() function rounds to the nearest value, but what if I get values in the output tensor like tensor([-3.44, -2.678, -0.65, 0.96])?
After rounding I get array([-3., -3., -1., 1.]), but for accuracy_score the values should be 0 and 1.

Please try to understand the code provided by @RaLo4. He explained in detail that you need to pass your logits through a sigmoid function. This converts the raw logits to probabilities, which you can then use with the round() function.
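
For example, a minimal sketch using the values from your post:

logits = torch.tensor([-3.44, -2.678, -0.65, 0.96])
probs = torch.sigmoid(logits)    # tensor([0.0311, 0.0643, 0.3430, 0.7231]) -- all in (0, 1)
preds = torch.round(probs)       # tensor([0., 0., 0., 1.])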


Okay, so for calculating the loss I need to pass the logits, but to calculate accuracy I need to pass the probabilities?

Yes. This is because the BCEWithLogitsLoss you are using has a built-in sigmoid layer.

This loss combines a Sigmoid layer and the BCELoss in one single class.

See here.
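
A quick sketch to illustrate that equivalence (the shapes here are arbitrary, just for illustration):

logits = torch.randn(8, 100)                      # raw network outputs
targets = torch.randint(0, 2, (8, 100)).float()   # multi-hot labels
loss_a = torch.nn.BCEWithLogitsLoss()(logits, targets)
loss_b = torch.nn.BCELoss()(torch.sigmoid(logits), targets)
# loss_a and loss_b are numerically (almost) identical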

It worked, thanks. Training accuracy is increasing, validation accuracy is increasing, and the loss is at a minimum, but on the test set the outputs after applying the sigmoid are all zeros; none is 1.

By “zeros” do you mean 0.something? Because that would be the expected behavior. Remember, 0.5 is your threshold.

After the sigmoid your values should be in a range between 0 and 1 (so not exceeding 1.0).
After np.round they should be either 0 or 1 (everything from 0.0 up to 0.5 will become 0 and everything above 0.5 up to 1.0 will become 1, so 0.5 is your threshold here).

Yeah, 0.0. If I get any value as 1 then that will be my predicted label, right? But all the values are 0. So I need to change the threshold to some value lower than 0.5.

I have no idea what you are trying to say here.
Are all your results 0 after rounding?

This would mean that they are between 0.0 and 0.5 after the sigmoid.
Keep in mind that the output of the sigmoid represents a probability.
Which would mean that your network is never more than 50% sure that a given input belongs to the class.

If that is indeed the case, then lowering your threshold is probably not the right thing to do, since this would suggest that there might be a problem in your network, like a heavily imbalanced dataset for example.

If you still want to lower your threshold, you could do this by comparing the output of the sigmoid to the threshold and setting the value to either 0 or 1 accordingly.
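
For example, something like this (the threshold value 0.4 is just a placeholder, and outputs are the sigmoid probabilities from the earlier snippet):

threshold = 0.4
predicted = (outputs > threshold).float()   # 1 where above the threshold, else 0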

Hello Hyo and RaLo!

Yes, from Hyo’s post, this should be understood as an imbalanced
dataset. This can be addressed with BCEWithLogitsLoss’s
pos_weight constructor argument.

This is not necessarily imbalanced in the sense of, say, class 7 vs.
class 23 (might be, might not be – from what Hyo has said, we don’t
know yet), but it is imbalanced in the sense of the presence, say, of
class 7 vs. the absence of class 7.

Let me give a few words of explanation:

This multi-label, 100-class classification problem should be
understood as 100 binary classification problems (run through the
same network “in parallel”). For each of the classes, say class 7, and
each sample, you make the binary prediction as to whether that class
is present in that sample.

Your class-present / class-absent binary-choice imbalance is (averaged
over classes) something like 5% class-present vs. 95% class-absent.
This is imbalanced enough that your network is likely being trained
to predict any one specific class being present with low probability.
It sounds like this is what you are seeing.

(The “standard” approach for using pos_weight would be to calculate
for each class c the fraction of times, f_c, that class c is present
in your samples (regardless of which other classes are present or
absent), and then calculate the weight w_c = (1 - f_c) / f_c. You
then pass the one-dimensional tensor [w_0, w_1, ..., w_99] into
BCEWithLogitsLoss’s constructor as its pos_weight argument.)
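
As a rough sketch of that calculation (targets here is an assumed
(num_samples, 100) multi-hot label tensor, not something from your code):

# targets: (num_samples, 100) tensor of 0s and 1s
f = targets.float().mean(dim=0)    # f_c: fraction of samples in which class c is present
pos_weight = (1.0 - f) / f         # w_c = (1 - f_c) / f_c (guard against f_c == 0 in practice)
criterion = torch.nn.BCEWithLogitsLoss(pos_weight=pos_weight)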

A second comment:

The most straightforward way to convert your network output to
0 vs. 1 predictions is to threshold the output logits against
0.0. You are certainly allowed to convert the logits to probabilities,
and then threshold against 0.5 (or, equivalently, round), but doing
so is not necessary. More detail is given in this post:
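
For example, a one-line sketch (with logits being the raw output of your network):

preds = (logits > 0.0).float()   # equivalent to thresholding sigmoid(logits) against 0.5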

Good luck.

K. Frank

I have included pos_weight in the loss function.

train_loss:1.2012622356414795  accuracy:0.4861111111111111 
train_loss:1.312620997428894  accuracy:0.4876543209876543 
train_loss:1.292380928993225  accuracy:0.4868827160493827 
train_loss:1.3908039331436157  accuracy:0.4691358024691358 
train_loss:1.4518522024154663  accuracy:0.4729938271604938 

train_loss stays between 1.2 and 1.5 and is not decreasing.
I have tried different learning rates.