Accuracy (and other metrics) with hierarchical classes


I have the following problem: my classes form a tree, and my model can predict a class at any level of that tree.


  • Say I have two ground-truth samples with the label paths 1/4/75 and 1/4/84.
  • My model can output any of the classes 1, 4, 75, 84.
  • If it outputs 75, only the 1/4/75 sample counts as a positive (correct) sample.
  • If it outputs 84, only the 1/4/84 sample counts as a positive sample.
  • If it outputs 1 or 4, both 1/4/75 and 1/4/84 count as positive samples.

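By positive, I mean that a prediction is correct whenever the predicted class lies on the sample's ground-truth path. So if the model returns 75 (or 84) for both samples, the accuracy on those two samples is 1/2, and if it returns 1 or 4 for both, the accuracy is 1.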
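To make the rule above concrete, here is a minimal PyTorch sketch of how I understand it. The index mapping (class ids 1, 4, 75, 84 mapped to 0..3), the `parent` tensor, and the helper names `build_ancestor_matrix` and `hierarchical_accuracy` are made up for illustration. The idea is to precompute a boolean "ancestor-or-self" matrix once, so that the per-batch accuracy is a single indexing operation.

```python
import torch

# Hypothetical encoding of the example tree: class ids 1, 4, 75, 84 are mapped
# to indices 0..3, and parent[i] is the index of i's parent (-1 for the root).
class_ids = [1, 4, 75, 84]
parent = torch.tensor([-1, 0, 1, 1])


def build_ancestor_matrix(parent: torch.Tensor) -> torch.Tensor:
    """Return a bool matrix where anc[c, a] is True iff a == c or a is an ancestor of c."""
    n = parent.numel()
    anc = torch.eye(n, dtype=torch.bool)
    node = parent.clone()
    # Walk every class up the tree in lockstep until all walks have passed the root.
    while (node >= 0).any():
        valid = node >= 0
        rows = torch.nonzero(valid).squeeze(1)
        anc[rows, node[valid]] = True
        node = torch.where(valid, parent[node.clamp(min=0)], node)
    return anc


def hierarchical_accuracy(pred: torch.Tensor, target: torch.Tensor, anc: torch.Tensor) -> float:
    """A prediction counts as correct if it lies on the target's ground-truth path."""
    return anc[target, pred].float().mean().item()


anc = build_ancestor_matrix(parent)
target = torch.tensor([2, 3])  # ground truth: 1/4/75 and 1/4/84

acc_leaf = hierarchical_accuracy(torch.tensor([2, 2]), target, anc)   # model returns 75 twice
acc_inner = hierarchical_accuracy(torch.tensor([1, 1]), target, anc)  # model returns 4 twice
print(acc_leaf, acc_inner)  # 0.5 1.0
```

The matrix costs O(n_classes^2) memory, which is an assumption that only holds up for moderately sized label trees; for very large hierarchies a sparse representation would probably be needed.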

My question is: how can I implement a metric such as accuracy efficiently with tensors? I'd also like to implement Mean Average Precision, since my problem is, in general, a multi-class classification problem.

Thanks for any help, e.g. links to relevant literature!

I've found the paper "Evaluation Measures for Hierarchical Classification", which pretty much covers the theory. I'll use the set-based measures, as they seem simpler and fit my case. I then looked for an efficient implementation of set operations, e.g. a batched intersection of two tensors, and found the Stack Overflow question "Finding non-intersection of two pytorch tensors". This seems to be what I need, but I have to go through the answers more carefully to check whether it really solves my problem.
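For reference, this is a rough sketch of how I currently read the set-based measures: each predicted and true class is extended with all of its ancestors, and hierarchical precision and recall are computed from the sizes of those extended sets and their intersection. If each set is stored as a multi-hot boolean row, the batched intersection becomes an elementwise AND, so no explicit set operations are needed. The ancestor matrix is hard-coded for the example tree, the function name `hierarchical_prf` is hypothetical, the scores are micro-averaged over the batch, and the root is counted as a set member; all of these are my assumptions and should be checked against the paper.

```python
import torch

# Ancestor-or-self matrix for the example tree 1 -> 4 -> {75, 84}, with class
# ids 1, 4, 75, 84 mapped to the (hypothetical) indices 0..3.
# Row c is the multi-hot encoding of the set {c} plus all ancestors of c.
anc = torch.tensor(
    [
        [1, 0, 0, 0],  # 1
        [1, 1, 0, 0],  # 4
        [1, 1, 1, 0],  # 75
        [1, 1, 0, 1],  # 84
    ],
    dtype=torch.bool,
)


def hierarchical_prf(pred: torch.Tensor, target: torch.Tensor, anc: torch.Tensor):
    """Micro-averaged hierarchical precision, recall and F1 over a batch."""
    pred_sets = anc[pred]    # (batch, n_classes) augmented predicted sets
    true_sets = anc[target]  # (batch, n_classes) augmented true sets
    inter = (pred_sets & true_sets).sum().float()  # summed intersection sizes
    hp = inter / pred_sets.sum().float()
    hr = inter / true_sets.sum().float()
    # Assumes a single shared root, so the intersection is never empty.
    hf = 2 * hp * hr / (hp + hr)
    return hp.item(), hr.item(), hf.item()


target = torch.tensor([2, 3])  # ground truth: 1/4/75 and 1/4/84

# Predicting the inner class 4 for both samples: precise but incomplete.
hp_inner, hr_inner, hf_inner = hierarchical_prf(torch.tensor([1, 1]), target, anc)

# Predicting the leaf 75 for both samples: one exact hit, one sibling miss.
hp_leaf, hr_leaf, hf_leaf = hierarchical_prf(torch.tensor([2, 2]), target, anc)
```

Note that these measures behave differently from the accuracy I described above: predicting 4 for both samples gives full precision but a recall below 1, because the leaf levels are missing from the predicted sets.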