You are right, I don’t even need it to be differentiable. Here is a new solution. However, I would like to extend the original problem with a new feature: ignore_index.
from torch.autograd import Variable

def dice_loss(output, target, weights=1):
    # One-hot encode the (N, H, W) integer target along the class dimension.
    encoded_target = output.data.clone().zero_()
    encoded_target.scatter_(1, target.unsqueeze(1), 1)
    encoded_target = Variable(encoded_target)

    assert output.size() == encoded_target.size(), "Input sizes must be equal."
    assert output.dim() == 4, "Input must be a 4D Tensor."

    # Per-sample, per-class sums over the spatial dimensions (H, W).
    num = (output * encoded_target).sum(dim=3).sum(dim=2)
    den1 = output.pow(2).sum(dim=3).sum(dim=2)
    den2 = encoded_target.pow(2).sum(dim=3).sum(dim=2)

    # Weighted Dice coefficient per class, summed over classes, averaged over the batch.
    dice = (2 * num / (den1 + den2)) * weights
    return dice.sum() / dice.size(0)
In semantic segmentation we generally have a label that we want to exclude from the loss. NLLLoss already supports this through its ignore_index parameter.
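To make the target behaviour concrete, here is a small sketch of what ignore_index does in NLLLoss: pixels carrying the ignored label contribute nothing, and the mean is taken over the remaining pixels only. The value 255 and the tensor shapes are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

IGNORE = 255  # assumed "void" label value, for illustration only

torch.manual_seed(0)
logits = torch.randn(2, 3, 4, 4)           # (N, C, H, W)
log_probs = F.log_softmax(logits, dim=1)
target = torch.randint(0, 3, (2, 4, 4))    # (N, H, W), integer labels
target[0, 0, 0] = IGNORE
target[1, 2, 3] = IGNORE

# Built-in handling of the ignored label.
loss_ignored = nn.NLLLoss(ignore_index=IGNORE)(log_probs, target)

# The same quantity by hand: average the per-pixel loss over valid pixels only.
valid = target != IGNORE
safe_target = target.clone()
safe_target[~valid] = 0                    # any valid class; masked out below
per_pixel = -log_probs.gather(1, safe_target.unsqueeze(1)).squeeze(1)
manual = per_pixel[valid].mean()
```

Both values should agree up to floating-point error, which is the behaviour a Dice variant with ignore_index would need to mirror.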
I would like to implement the same for this dice loss. I have already thought of two solutions, but I don’t like either of them:
- the worst: re-encode all the labels so that ignore_index becomes a valid new label, which means modifying my classifier layer. This is really ugly for a lot of reasons.
- inside the loss function, remap the ignore_index label to a new label, expand the output to match the new size, and finally ignore this label at the end. I don’t really like this solution either: it involves copying and modifying the targets and expanding the channel dimension of the output tensor (I think).
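As a rough sketch of the second option, with one variation: instead of expanding the output, the extra channel is added to the one-hot target only and then dropped, and the output is masked at the ignored pixels. The function name dice_score_ignore, the default ignore_index=255 and the eps term are assumptions introduced here, not part of the code above.

```python
import torch

def dice_score_ignore(output, target, ignore_index=255, weights=1, eps=1e-7):
    """output: (N, C, H, W) probabilities; target: (N, H, W) int64 labels."""
    assert output.dim() == 4, "Input must be a 4D Tensor."
    n, c, h, w = output.shape

    # Remap the ignored label to a temporary extra class index C.
    valid = target != ignore_index
    remapped = target.clone()
    remapped[~valid] = c

    # One-hot encode over C + 1 channels, then drop the temporary channel.
    encoded = output.new_zeros(n, c + 1, h, w)
    encoded.scatter_(1, remapped.unsqueeze(1), 1)
    encoded = encoded[:, :c]

    # Zero the predictions at ignored pixels so they do not enter den1.
    masked_output = output * valid.unsqueeze(1).to(output.dtype)

    num = (masked_output * encoded).sum(dim=(2, 3))
    den1 = masked_output.pow(2).sum(dim=(2, 3))
    den2 = encoded.pow(2).sum(dim=(2, 3))

    # eps (an assumption) guards against an empty class in a sample.
    dice = (2 * num / (den1 + den2 + eps)) * weights
    return dice.sum() / dice.size(0)
```

This still copies the targets, but the only extra allocation on the prediction side is the masked product, and predictions at ignored pixels no longer influence the result.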
If you have already faced this kind of problem, I would like to have your point of view on this.
Thanks!