Jaccard coefficient right definition

Alex5 · April 8, 2020, 10:59am

Hi,
I use the Jaccard coefficient to validate my binary segmentation model. I’ve found a definition:

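def get_jaccard(y_true, y_pred):
    epsilon = 1e-15
    # Sum over H, then W, then the channel dim, leaving one value per image: shape [N]
    intersection = (y_pred * y_true).sum(dim=-2).sum(dim=-1).sum(dim=-1)
    union = y_true.sum(dim=-2).sum(dim=-1).sum(dim=-1) + y_pred.sum(dim=-2).sum(dim=-1).sum(dim=-1)
    # `union` above is |A| + |B|, so subtract the intersection to get the true union
    return (intersection / (union - intersection + epsilon)).mean()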

Input and output have the same shape, [N, 1, 256, 256], where N is the batch size, followed by a single channel and the 256×256 image size. So it computes the intersection and union for each image in the batch, then takes the mean IoU over the batch.

I’ve made a simplified implementation:

def jaccard_coeff(input, target):
    eps = 1e-15
    # Flatten the whole batch into a single vector
    input = input.view(-1)
    target = target.view(-1)
    intersection = (input * target).sum()
    union = (input.sum() + target.sum()) - intersection
    return intersection / (union + eps)

I flatten the tensors directly and compute a single value without taking a mean over the batch. The two functions return different results, so I would like to know which one is correct and be sure I’m using the metric properly for validation.

Thanks

vfdev-5 · April 8, 2020, 11:36am

There are several repositories where it was already implemented (probably in the correct way):

In your case, as far as I understand, the difference is the following:

the first implementation computes the Jaccard index per image, then takes the mean over the N images
the second implementation computes a single Jaccard index as if all N images were concatenated

Maybe the second is more standard.
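To make the difference concrete, here is a minimal sketch in plain Python (lists instead of tensors, so it runs without PyTorch). The helper names `iou`, `per_image_mean_iou` and `pooled_iou` and the toy masks are made up for illustration; they are not taken from either implementation above, but they follow the same two aggregation strategies.

```python
def iou(pred, true, eps=1e-15):
    # Jaccard index of two flat binary masks (lists of 0/1)
    inter = sum(p * t for p, t in zip(pred, true))
    union = sum(pred) + sum(true) - inter
    return inter / (union + eps)

def per_image_mean_iou(preds, trues):
    # First strategy: one IoU per image, then the mean over the batch
    scores = [iou(p, t) for p, t in zip(preds, trues)]
    return sum(scores) / len(scores)

def pooled_iou(preds, trues):
    # Second strategy: flatten the whole batch, then compute a single IoU
    flat_pred = [v for p in preds for v in p]
    flat_true = [v for t in trues for v in t]
    return iou(flat_pred, flat_true)

# Toy batch of two 4-pixel "images" (hypothetical data)
preds = [[1, 1, 0, 0], [0, 0, 0, 0]]
trues = [[1, 1, 1, 1], [1, 0, 0, 0]]

print(per_image_mean_iou(preds, trues))  # about 0.25: mean of 0.5 and 0.0
print(pooled_iou(preds, trues))          # about 0.4: 2 shared pixels / 5 in the union
```

The per-image mean gives every image the same weight, so an image with a small object that is missed entirely pulls the score down a lot. The pooled version weights images by how many foreground pixels they contain. Note also that with this epsilon placement an image where both masks are empty scores 0 in the per-image version, which lowers the batch mean.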

HTH

Alex5 · April 9, 2020, 8:50am

Thanks for your answer
Indeed there are similarities with the second option.