I’m calculating the Dice score to evaluate my model for a binary image segmentation problem.
The function I wrote in PyTorch is:
import torch

def dice_score_reduced_over_batch(x, y, smooth=1):
    assert x.ndim == y.ndim
    # reduce over all axes except 0, i.e. the batch axis
    axes = tuple(range(1, x.ndim))
    intersection = torch.abs((x * y).sum(dim=axes))
    union = torch.abs(x.sum(dim=axes)) + torch.abs(y.sum(dim=axes))
    # per-sample Dice, then mean over the batch
    dice = torch.mean(2. * (intersection + smooth) / (union + smooth), dim=0)
    return dice
The input tensors x and y have the shape [batch_size, nChannel, height, width], where nChannel=1 since the ground truth is a 2D binary mask. The standard way to calculate the Dice score is to compute it along the batch axis and then take the mean at the end (right?). I found that the score is affected by the way the inputs are flattened:
╔═══════════════════╦══════════════════╦════════╗
║ input tensor ║ flattened tensor ║ dice ║
╠═══════════════════╬══════════════════╬════════╣
║ [64, 1, 128, 128] ║ - ║ 0.2754 ║
╠═══════════════════╬══════════════════╬════════╣
║ [64, 1, 128, 128] ║ [64, 16384] ║ 0.2754 ║
╠═══════════════════╬══════════════════╬════════╣
║ [64, 1, 128, 128] ║ [1, 1048576] ║ 0.3121 ║
╚═══════════════════╩══════════════════╩════════╝
My best guess was that this difference is due to the way the values are averaged, but that isn't the case: flattening each sample individually (row 2) matches the original result, yet flattening the whole batch into a single sample (row 3) changes it. The code should return exactly the same answer irrespective of the arrangement/shape of the input data. How can this behavior be explained, and what is the best way to avoid it?
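For reference, here is a minimal, self-contained reproduction of the discrepancy. It uses random tensors (so the exact numbers differ from the table above) and a simplified dice helper without the abs calls, since my inputs are non-negative:

```python
import torch

torch.manual_seed(0)

# Random "soft" predictions and a sparse binary mask as a stand-in
# for real model output and ground truth.
x = torch.rand(64, 1, 128, 128)
y = (torch.rand(64, 1, 128, 128) > 0.9).float()

def dice(x, y, smooth=1):
    # per-sample reduction over all axes except the batch axis,
    # then mean over the batch
    axes = tuple(range(1, x.ndim))
    inter = (x * y).sum(dim=axes)
    union = x.sum(dim=axes) + y.sum(dim=axes)
    return torch.mean(2. * (inter + smooth) / (union + smooth))

per_sample = dice(x, y)                            # shape [64, 1, 128, 128]
flat_batch = dice(x.view(64, -1), y.view(64, -1))  # shape [64, 16384]
flat_all   = dice(x.view(1, -1), y.view(1, -1))    # shape [1, 1048576]

# per_sample and flat_batch agree; flat_all does not
print(per_sample.item(), flat_batch.item(), flat_all.item())
```

Flattening to [64, 16384] leaves the per-sample sums untouched, but flattening to [1, 1048576] turns the mean of 64 per-sample ratios into a single ratio of pooled sums, which is where the third number diverges.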