Calculating accuracy for a multi-label classification problem

I used CrossEntropyLoss before in a single-label classification problem and then I could calculate the accuracy like this:

_, predicted = torch.max(classified_labels.data, 1)
total = len(labels)
correct = (predicted == labels).sum()
accuracy = 100 * correct / total

Now I am trying to move on to a multi-label classification problem using MultiLabelMarginLoss or MultiLabelSoftMarginLoss (is one of these the right equivalent to choose?), but I am unsure how to calculate the accuracy of classified_labels.

I came up with this for criterion = nn.MultiLabelSoftMarginLoss():

predicted = torch.clamp(torch.round(classified_labels.data), 0, 1).numpy().astype(int)
total = len(labels) * 17  # 17 labels per sample
correct = (predicted == labels.numpy().astype(int)).sum()
train_accuracy = 100 * correct / total

Is this correct?


I came up with the answer: torch.sigmoid(classified_labels).data > 0.5 will give the correct labels with MultiLabelSoftMarginLoss().
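For example (the logits below are made up for illustration):

```python
import torch

# Hypothetical raw network outputs (logits) for 3 samples and 4 labels
classified_labels = torch.tensor([[ 2.0, -1.0,  0.5, -3.0],
                                  [-0.5,  1.5, -2.0,  0.1],
                                  [ 0.0,  3.0, -1.0, -0.2]])

# MultiLabelSoftMarginLoss applies the sigmoid internally during training,
# so at evaluation time apply sigmoid to the logits and threshold at 0.5
# to get hard 0/1 predictions.
predicted = (torch.sigmoid(classified_labels) > 0.5).int()
print(predicted)
# tensor([[1, 0, 1, 0],
#         [0, 1, 0, 1],
#         [0, 1, 0, 0]])
```

Note that sigmoid(x) > 0.5 is equivalent to x > 0, so you can also threshold the raw logits at zero directly.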


Hi @bartolsthoorn,

there seem to be a few possible ways to try to do this,

We really need a working example, so if you do get it working, perhaps you could post a gist?

Kind regards,



Here is a simple gist showing how to do multi label classification:
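In short, the idea is along these lines (a minimal sketch, not the gist itself; the model, sizes, and hyperparameters are made up for illustration):

```python
import torch
import torch.nn as nn

# Toy data: 8 samples, 10 features, 5 labels (multi-hot targets)
torch.manual_seed(0)
x = torch.randn(8, 10)
y = torch.randint(0, 2, (8, 5)).float()

model = nn.Linear(10, 5)  # stand-in for a real network
criterion = nn.MultiLabelSoftMarginLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

losses = []
for epoch in range(20):
    optimizer.zero_grad()
    logits = model(x)            # raw scores, no sigmoid here
    loss = criterion(logits, y)  # the loss applies sigmoid internally
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
```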



Accuracy is probably not what you want for Multi-Label classification especially if your classes are unbalanced.

Let’s say class A is present in 90% of your dataset, and classes B and C each occur about 10% of the time: a model that always returns class A and never classes B and C will still reach roughly 90% per-label accuracy, while having no predictive power.

Your metric should take false positives and false negatives into account. You can check the Matthews correlation coefficient, the F-beta score (often the F1 score), or the Hamming loss, which are designed for multi-label classification and are implemented in scikit-learn.
Also make sure to read up on precision and recall.
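For example, with scikit-learn (the target and prediction arrays below are made up for illustration):

```python
import numpy as np
from sklearn.metrics import f1_score, hamming_loss, matthews_corrcoef

# Multi-hot targets and predictions: 4 samples, 3 labels
y_true = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 0], [1, 0, 0]])
y_pred = np.array([[1, 0, 0], [0, 1, 0], [1, 0, 0], [1, 0, 0]])

micro_f1 = f1_score(y_true, y_pred, average='micro')  # 0.8
ham = hamming_loss(y_true, y_pred)                    # 2 wrong cells out of 12

# matthews_corrcoef expects 1-D inputs, so compute it per label
mcc = [matthews_corrcoef(y_true[:, i], y_pred[:, i])
       for i in range(y_true.shape[1])]
```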


So I applied torch.sigmoid to my classified labels and I get numbers approximately close to zero and one. Will correct += (pred == y).sum().item() compare numbers that are relatively close, or does the prediction need to be exactly the same? If so, how do I convert the approximate numbers into whole 1s and 0s?

The comparison will most likely not work unless you use a small epsilon to compare the prediction with your target. A better way is to use a threshold on your output, so that your predictions contain only zeros and ones.
A default value might be 0.5, though you may want to tune the threshold for your metric.
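For example (the probabilities and targets below are made up for illustration):

```python
import torch

# Hypothetical sigmoid outputs and multi-hot targets: 3 samples, 4 labels
probs = torch.tensor([[0.91, 0.12, 0.77, 0.03],
                      [0.45, 0.88, 0.20, 0.60],
                      [0.51, 0.49, 0.95, 0.10]])
y = torch.tensor([[1., 0., 1., 0.],
                  [0., 1., 0., 1.],
                  [1., 1., 1., 0.]])

pred = (probs > 0.5).float()          # hard 0/1 predictions via a 0.5 threshold
correct = (pred == y).sum().item()    # element-wise matches: 11 of 12
accuracy = 100 * correct / y.numel()  # ~91.67% per-label accuracy
```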
