Is there any nice pre-defined function to calculate precision, recall and F1 score for multi-class multilabel classification?

I have a multi-class multi-label classification problem where there are 4 classes (happy, laughing, jumping, smiling) and each class can be positive (1) or negative (0). An input can belong to more than one class. Let’s say that for an input x, the actual labels are [1,0,0,1] and the predicted labels are [1,1,0,0]. How do I calculate the precision, recall and F1 score for this fine-grained approach? Are there any predefined methods to do this?

I found this and changed it slightly to adapt it to your question. I haven’t tested it rigorously, so you should probably do that.

import torch

def F1_score(prob, label):
    # Expects predictions already binarized to 0/1: .bool() maps every
    # nonzero value to True, so threshold raw probabilities first.
    prob = prob.bool()
    label = label.bool()
    epsilon = 1e-7  # avoids division by zero
    # Counts are pooled over all samples and all labels.
    TP = (prob & label).sum().float()
    TN = ((~prob) & (~label)).sum().float()
    FP = (prob & (~label)).sum().float()
    FN = ((~prob) & label).sum().float()
    #accuracy = (TP+TN)/(TP+TN+FP+FN)
    precision = torch.mean(TP / (TP + FP + epsilon))
    recall = torch.mean(TP / (TP + FN + epsilon))
    F1 = 2 * precision * recall / (precision + recall + epsilon)
    return precision, recall, F1

y_true = torch.tensor([[1,0,0,1]])
y_pred = torch.tensor([[1,1,0,0]])
print(F1_score(y_pred, y_true))

I found that PyTorch Lightning has implementations of precision, recall and F1 score; perhaps you could use those.


Thanks for this. Take note that since you are using sum to sum over the whole tensor, torch.mean would be superfluous as you’d be taking the mean over a single value. You can remove the call to torch.mean for the precision and recall and just compute the single values on their own.
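
Following that suggestion, here is a minimal sketch of the same computation without `torch.mean`. The helper name `multilabel_prf` and its `per_label` flag are hypothetical, not part of any library. With `per_label=True` the counts are summed over the sample dimension only, so you get one precision, recall and F1 value per class instead of a single pooled value.

```python
import torch

def multilabel_prf(pred, target, per_label=False, eps=1e-7):
    """Precision, recall and F1 for 0/1 tensors of shape (num_samples, num_labels).

    Hypothetical helper for illustration, not a library function.
    """
    pred, target = pred.bool(), target.bool()

    def count(mask):
        # per_label=True keeps one count per label (column);
        # otherwise everything is pooled into a single scalar.
        return mask.sum(dim=0).float() if per_label else mask.sum().float()

    tp = count(pred & target)
    fp = count(pred & ~target)
    fn = count(~pred & target)
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    f1 = 2 * precision * recall / (precision + recall + eps)
    return precision, recall, f1

y_true = torch.tensor([[1, 0, 0, 1]])
y_pred = torch.tensor([[1, 1, 0, 0]])
print(multilabel_prf(y_pred, y_true))                  # pooled: all three are about 0.5
print(multilabel_prf(y_pred, y_true, per_label=True))  # one value per class
```

Note that a class with no true positives gets a score of 0 here because of the epsilon term, including classes that are neither present nor predicted. Averaging the per-class values would give a macro average, while the pooled version corresponds to a micro average.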