Semantic segmentation evaluation metrics (from torchvision references)

I trained a DeepLabV3 and an FCN (torchvision) on my own dataset (2 classes: background and my object) and use the metrics computed by the ConfusionMatrix class in utils.py.

Can someone explain these metrics to me? I know the IoU, but what do “global correct” and “average row correct” mean?

Test:  [ 0/32]  eta: 0:00:18    time: 0.5879  data: 0.1397  max mem: 0
Test: Total time: 0:00:18
global correct: 99.5
average row correct: ['99.9', '88.2']
IoU: ['99.4', '86.9']
mean IoU: 93.2

Hi,

average row correct shows the accuracy per class, i.e. the number of pixels correctly predicted as this class divided by the total number of ground-truth pixels of that class. In the code this is acc = torch.diag(h) / h.sum(1): the diagonal of h holds the correctly predicted pixels per class, and h.sum(1) is the sum of each row of the confusion matrix, i.e. the total number of pixels that actually belong to that class (the matrix is filled with the ground truth along the rows and the predictions along the columns).
The global correct is your overall pixel accuracy, i.e. the fraction of all pixels that got classified correctly - in the code: acc_global = torch.diag(h).sum() / h.sum().
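
To make that concrete, here is a minimal sketch of how the three printed values fall out of the confusion matrix, following the formulas used in torchvision's references/segmentation/utils.py (the hard-coded matrix below is just made up for illustration):

```python
import torch

# Toy 2x2 confusion matrix: rows = ground-truth class, columns = predicted class
# (background, object) -- the counts are made up for illustration only.
h = torch.tensor([[950., 5.],
                  [10., 35.]])

acc_global = torch.diag(h).sum() / h.sum()                   # "global correct"
acc = torch.diag(h) / h.sum(1)                               # "average row correct" (per class)
iu = torch.diag(h) / (h.sum(1) + h.sum(0) - torch.diag(h))   # IoU per class

print(acc_global.item())   # overall pixel accuracy
print(acc.tolist())        # per-class accuracy
print(iu.tolist())         # per-class IoU
print(iu.mean().item())    # mean IoU
```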

Easy example

                                Predicted Class
                           |    Cat    |    Dog    |
Actual      Cat            |     5     |     2     |
class       Dog            |     3     |     4     |

average row correct (accuracy per class):
cat accuracy: 5 / (5 + 2) = 0.71   (correct cats / all actual cats)
dog accuracy: 4 / (3 + 4) = 0.57   (correct dogs / all actual dogs)

global correct:
total acc = (5 + 4) / (5 + 2 + 3 + 4) =  0.64

IoU (intersection over union = true positives / (true positives + false positives + false negatives)):
cat IoU = 5 / (2 + 5 + 3) = .5
dog IoU = 4 / (2 + 4 + 3) = .44
mIoU = .47 
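
If you want to double-check these numbers, you can plug the example matrix into the same formulas (a quick sketch, reusing the expressions from utils.py):

```python
import torch

# Confusion matrix from the example: rows = actual class, columns = predicted class
h = torch.tensor([[5., 2.],    # actual cats: 5 predicted cat, 2 predicted dog
                  [3., 4.]])   # actual dogs: 3 predicted cat, 4 predicted dog

acc_global = torch.diag(h).sum() / h.sum()                   # (5 + 4) / 14 = 0.64
acc = torch.diag(h) / h.sum(1)                               # [0.71, 0.57]
iu = torch.diag(h) / (h.sum(1) + h.sum(0) - torch.diag(h))   # [0.50, 0.44]
print(acc_global, acc, iu, iu.mean())                        # mean IoU = 0.47
```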


Thank you very much @Caruso! 🙂