Top_k accuracy for multilabel classification

jpainam · February 25, 2021, 6:46am

Hi.
Top-k metrics are widely used to assess the quality of multi-label classification.
I couldn't find a metric in PyTorch that can monitor multi-label classification training out of the box, so I tried the function below (edited following @Eta_C's comment),
but it raises an error.

import torch

output = torch.randn(64, 134)
target = torch.randn(64, 134)

def accuracy(output, target, topk=(1,)):
    maxk = max(topk)
    batch_size = target.size(0)

    _, pred = output.topk(maxk, dim=1, largest=True, sorted=True)
    pred = pred.t()
    correct = pred.eq(target.view(1, -1).expand_as(pred))

    ret = []
    for k in topk:
        correct_k = correct[:k].contiguous().view(-1).float().sum(dim=0, keepdim=True)
        ret.append(correct_k.mul_(1. / batch_size))
    return ret
## using
acc = accuracy(output, target, topk=(1,2,5))

The error:

Traceback (most recent call last):
  File "main.py", line 77, in <module>
    trainer.train(epoch)
  File "/home/xx/xxx/trainer.py", line 152, in train
    prec = accuracy(pred_attrs.data, attrs.data, topk=(1, 2, 5))
  File "/home/xx/xxx/evaluation/classification.py", line 23, in accuracy
    correct = pred.eq(target.view(1, -1).expand_as(pred))
RuntimeError: The expanded size of the tensor (64) must match the existing size (8576) at non-singleton dimension 1.  Target sizes: [5, 64].  Tensor sizes: [1, 8576]

Any suggestions on how to implement this?
Thank you
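
For reference, the shape mismatch in the traceback can be reproduced in isolation. This is a minimal sketch assuming the same shapes as in the post; `index_target` is a name introduced here for illustration only.

```python
import torch

output = torch.randn(64, 134)
target = torch.randn(64, 134)            # same shape as in the question

_, pred = output.topk(5, dim=1)
pred = pred.t()                          # shape (5, 64)
flat = target.view(1, -1)                # shape (1, 8576), since 64 * 134 = 8576

try:
    flat.expand_as(pred)                 # 8576 columns cannot be expanded to 64
    expand_failed = False
except RuntimeError:
    expand_failed = True

# With one class index per sample, the same call succeeds.
index_target = torch.randint(0, 134, (64,))
expanded = index_target.view(1, -1).expand_as(pred)   # shape (5, 64)
```

The function therefore seems to assume a 1-D target of class indices, which a multi-hot target matrix does not satisfy.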

Eta_C · February 25, 2021, 7:23am

Hey bro, maybe your code comes from

github.com/bearpaw/pytorch-classification/blob/cc9106d598ff1fe375cc030873ceacfea0499d77/utils/eval.py

from __future__ import print_function, absolute_import

__all__ = ['accuracy']

def accuracy(output, target, topk=(1,)):
    """Computes the precision@k for the specified values of k"""
    maxk = max(topk)
    batch_size = target.size(0)

    _, pred = output.topk(maxk, 1, True, True)
    pred = pred.t()
    correct = pred.eq(target.view(1, -1).expand_as(pred))

    res = []
    for k in topk:
        correct_k = correct[:k].view(-1).float().sum(0)
        res.append(correct_k.mul_(100.0 / batch_size))
    return res

See here:

If we take the top-3 accuracy for this, the correct class only needs to be in the top three predicted classes to count.

Assuming you have 64 samples, the inputs should look like

output = torch.randn(64, 134)
target = torch.randint(0, 134, (64,))  # one integer class index per sample
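
To make the top-k idea concrete for the single-label case, here is a small hand-checkable sketch. The helper name `topk_accuracy` and the toy scores are made up for illustration; this is not the code from the linked repository.

```python
import torch

def topk_accuracy(output, target, topk=(1,)):
    """Fraction of samples whose true class index is among the k highest scores."""
    maxk = max(topk)
    # pred: (N, maxk) indices of the highest-scoring classes per sample
    _, pred = output.topk(maxk, dim=1, largest=True, sorted=True)
    # hits[i, j] is True when the j-th ranked class of sample i is the true class
    hits = pred.eq(target.view(-1, 1))
    return [hits[:, :k].any(dim=1).float().mean().item() for k in topk]

# Three samples, four classes; scores chosen by hand so the result is easy to check.
output = torch.tensor([
    [0.10, 0.70, 0.20, 0.00],   # ranking: 1, 2, 0, 3
    [0.50, 0.10, 0.30, 0.05],   # ranking: 0, 2, 1, 3
    [0.00, 0.20, 0.30, 0.50],   # ranking: 3, 2, 1, 0
])
target = torch.tensor([1, 2, 0])  # integer class indices, shape (3,)

acc1, acc2 = topk_accuracy(output, target, topk=(1, 2))
print(acc1, acc2)  # 1 of 3 samples hit at k=1, 2 of 3 at k=2
```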

jpainam · February 25, 2021, 7:54am

I used this code a while ago for a classification problem. I don't remember which StackOverflow thread it came from. But thank you for googling and pointing out the GitHub repo (it may not be the original one).
My problem still remains, though. The implementation works for single-label classification (binary or multi-class), not for multi-label classification. In multi-label classification, a sample can belong to more than one category.
For instance, for 5 classes, a target for a sample x could be

target_x = [1, 0, 1, 0, 0]
# then for 64 samples, the targets are [64, 5] not [64] 
# I'm using  134 categories

Multi-label classification is mostly used in attribute classification, where a given image can have more than one attribute. Each attribute is treated as a separate binary classification problem. I hope that makes it clearer.
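
As an illustration of the "one binary problem per attribute" setup, here is a minimal sketch. The label lists and logits are placeholders, and `nn.BCEWithLogitsLoss` is one common choice of loss, not necessarily the one used in my training code.

```python
import torch
import torch.nn as nn

num_classes = 5
# Hypothetical label lists: sample 0 has attributes 0 and 2, sample 1 has attribute 4.
label_lists = [[0, 2], [4]]

# Build the multi-hot target matrix of shape (batch, num_classes).
target = torch.zeros(len(label_lists), num_classes)
for row, labels in enumerate(label_lists):
    target[row, labels] = 1.0

logits = torch.randn(len(label_lists), num_classes)  # stand-in for model output

# One independent sigmoid / binary cross-entropy term per attribute.
loss = nn.BCEWithLogitsLoss()(logits, target)

# Independent per-attribute decisions at a 0.5 probability threshold.
predicted = torch.sigmoid(logits) > 0.5
```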

Eta_C · February 25, 2021, 8:59am

So the target is a boolean tensor. In that case, maybe something like this:

import torch

output = torch.randn(64, 134)
target = torch.randn(64, 134) > 0.5  # multi-hot boolean targets

def accuracy(output, target, topk=(1,)):
    """Fraction of true labels that appear among the top-k predictions, pooled over the batch."""
    maxk = max(topk)

    # pred: (batch, maxk) indices of the highest-scoring classes
    _, pred = output.topk(maxk, dim=1, largest=True, sorted=True)
    ret = []
    for k in topk:
        # Boolean mask that is True at each sample's top-k classes
        topk_mask = torch.zeros_like(target).scatter(1, pred[:, :k], 1)
        correct = (target * topk_mask).float()
        ret.append(correct.sum() / target.sum())
    return ret

# usage
acc = accuracy(output, target, topk=(1, 2, 5))
print(acc)
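
Note that `correct.sum() / target.sum()` pools hits over the whole batch, so the value is closer to a recall@k than to an accuracy. If per-sample numbers are wanted instead, one possible variant is sketched below. The name `precision_recall_at_k` and the zero-label convention are assumptions made here, not part of the code above.

```python
import torch

def precision_recall_at_k(output, target, k):
    """Per-sample precision@k and recall@k for multi-hot targets, averaged over the batch."""
    target = target.float()
    _, pred = output.topk(k, dim=1)
    # topk_mask[i, c] is 1 when class c is among the k highest scores of sample i
    topk_mask = torch.zeros_like(output).scatter(1, pred, 1.0)
    hits = (topk_mask * target).sum(dim=1)          # true labels found in the top k
    precision = hits / k
    # Samples without any positive label get recall 0 here (an arbitrary convention).
    recall = hits / target.sum(dim=1).clamp(min=1)
    return precision.mean().item(), recall.mean().item()

output = torch.tensor([
    [0.9, 0.1, 0.8, 0.2],   # top-2 classes: 0 and 2
    [0.3, 0.6, 0.1, 0.7],   # top-2 classes: 3 and 1
])
target = torch.tensor([
    [1, 0, 1, 0],           # both true labels are in the top 2
    [1, 0, 0, 1],           # only label 3 is in the top 2
])
p, r = precision_recall_at_k(output, target, k=2)
print(p, r)  # both 0.75 for this toy batch
```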

jpainam · February 25, 2021, 9:56am

Yes, target is a boolean tensor. Thank you, it's working as expected.
Thanks