Class-wise accuracy

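import torch

correct = 0  # number of correct predictions so far
total = 0    # number of samples seen so far
for batch_idx, (inputs, labels) in enumerate(test_loader):
    inputs, labels = inputs.cuda(), labels.cuda()
    with torch.no_grad():  # inference only, no gradients needed
        outputs = model(inputs)
    _, predicted = outputs.max(1)  # index of the highest score is the predicted class
    total += labels.size(0)
    correct += predicted.eq(labels).sum().item()
print(correct / total)  # overall accuracy as a fraction in [0, 1]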

nb_classes = 10
# Rows are true classes, columns are predicted classes.
confusion_matrix = torch.zeros(nb_classes, nb_classes)
for batch_idx, (inputs, labels) in enumerate(test_loader):
    inputs = inputs.cuda()
    labels = labels.cuda()
    with torch.no_grad():  # inference only, no gradients needed
        outputs = net(inputs)
    _, preds = torch.max(outputs, 1)
    for t, p in zip(labels.view(-1), preds.view(-1)):
        confusion_matrix[t.long(), p.long()] += 1
ele_wise_acc = confusion_matrix.diag() / confusion_matrix.sum(1)  # class-wise accuracy
print(ele_wise_acc.mean() * 100)  # mean of the class-wise accuracies, in percent

[1]: Code that calculates the overall accuracy.
[2]: Code that calculates the class-wise accuracy, from "How to find individual class Accuracy".

[1] and [2] give different accuracies.
Both look fine to me. Can anyone spot a problem?

Hi oasjd7!

In short, the two results will differ when the classes don’t all have
the same number of samples (and some other conditions that
aren’t the main point).

In [1] you are calculating the number of correct predictions
(regardless of the specific class) divided by the total number
of predictions. This is what I would typically call “accuracy.”

In [2] you are calculating the accuracy for each class separately,
and then taking the mean of those class accuracies. This is
not the same.
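
One way to write the difference down, using notation of my own rather than anything from your code: let $C$ be the number of classes, $n_c$ the number of samples in class $c$, $k_c$ the number of those samples predicted correctly, and $N = \sum_c n_c$ the total number of samples.

```latex
\text{[1]:}\quad \mathrm{acc} = \frac{\sum_{c=1}^{C} k_c}{N}
  = \sum_{c=1}^{C} \frac{n_c}{N} \cdot \frac{k_c}{n_c}
\qquad
\text{[2]:}\quad \overline{\mathrm{acc}} = \frac{1}{C} \sum_{c=1}^{C} \frac{k_c}{n_c}
```

Both are averages of the per-class accuracies $k_c / n_c$. [1] weights class $c$ by its share of the data, $n_c / N$, while [2] weights every class by $1 / C$. The two weightings coincide when all the $n_c$ are equal.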

To see this, consider a case where you have two classes, but
class A has 1000 samples and class B has 10 samples. Let’s
say you get all 1000 class A predictions wrong and get all 10
class B predictions right. Your overall accuracy ([1]) will be
10 / 1010, which is about 1%. But your class A accuracy is
0% and your class B accuracy is 100%, so averaging those
accuracies over the two classes will give you 50%.
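
The arithmetic in this example can be checked with a few lines of plain Python. This is only a sketch: the dictionaries `samples` and `correct` are made-up names holding the hypothetical counts from the example, not anything taken from your code.

```python
# Hypothetical counts from the two-class example (not real data).
samples = {"A": 1000, "B": 10}  # number of samples per class
correct = {"A": 0, "B": 10}     # number of correct predictions per class

# [1] Overall accuracy: all correct predictions over all predictions.
overall_acc = sum(correct.values()) / sum(samples.values())

# [2] Per-class accuracy first, then an unweighted mean over the classes.
class_acc = {c: correct[c] / samples[c] for c in samples}
mean_class_acc = sum(class_acc.values()) / len(class_acc)

print(f"overall accuracy [1]:    {overall_acc:.4f}")
print(f"mean class accuracy [2]: {mean_class_acc:.4f}")
```

The first number comes out at roughly 1% and the second at 50%, matching the figures above.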


K. Frank