One-hot encoding with autograd (Dice loss)

trypag · November 11, 2017, 6:05pm

You are right, I don’t even need it to be differentiable. Here is a new solution. However, I would like to extend the original problem with a new feature: ignore_index.

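from torch.autograd import Variable

def dice_loss(output, target, weights=1):
    # output: (N, C, H, W) predictions; target: (N, H, W) integer class labels
    # One-hot encode the target along the channel dimension (no gradient needed)
    encoded_target = output.data.clone().zero_()
    encoded_target.scatter_(1, target.unsqueeze(1), 1)
    encoded_target = Variable(encoded_target)

    assert output.size() == encoded_target.size(), "Input sizes must be equal."
    assert output.dim() == 4, "Input must be a 4D Tensor."

    # Sum over the spatial dimensions, leaving one value per sample and class: (N, C)
    num = (output * encoded_target).sum(dim=3).sum(dim=2)
    den1 = output.pow(2).sum(dim=3).sum(dim=2)
    den2 = encoded_target.pow(2).sum(dim=3).sum(dim=2)

    # Weighted Dice coefficient, summed over classes and averaged over the batch.
    # Note: this value is higher for better overlap, so negate it (or use 1 - dice) to minimise.
    dice = (2 * num / (den1 + den2)) * weights
    return dice.sum() / dice.size(0)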
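
To make the one-hot step concrete, here is a small sketch of what scatter_ does to a label map. The tensor values and the num_classes name are made up for illustration, and it uses plain tensors rather than Variable.

```python
import torch

# Hypothetical toy label map: N=1, H=2, W=2, with 3 classes
num_classes = 3
target = torch.tensor([[[0, 2],
                        [1, 1]]])                    # shape (1, 2, 2), dtype int64

# Start from zeros of shape (N, C, H, W) and write a 1 into the channel
# selected by each pixel's label
one_hot = torch.zeros(1, num_classes, 2, 2)
one_hot.scatter_(1, target.unsqueeze(1), 1)

# Every pixel has exactly one active channel, and argmax recovers the labels
print(one_hot.sum(dim=1))
print(one_hot.argmax(dim=1))
```

Because the index tensor must contain valid channel indices, a label such as 255 would make scatter_ fail here, which is why ignore_index needs special handling.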

In semantic segmentation we generally have a label that we want to exclude from the loss; NLLLoss already supports this through its ignore_index parameter.
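
For reference, this is roughly how that behaviour looks with NLLLoss. The label value 255 and the tensor shapes are assumptions chosen for the example, not something taken from my setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

IGNORE = 255  # hypothetical "void" label, chosen only for this example

logits = torch.randn(1, 3, 2, 2)                     # (N, C, H, W)
target = torch.tensor([[[0, IGNORE],
                        [1, 2]]])                    # (N, H, W)

log_probs = F.log_softmax(logits, dim=1)
criterion = nn.NLLLoss(ignore_index=IGNORE)
loss = criterion(log_probs, target)

# The ignored pixel contributes neither to the sum nor to the averaging count
manual = -(log_probs[0, 0, 0, 0] + log_probs[0, 1, 1, 0] + log_probs[0, 2, 1, 1]) / 3
```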
I would like to implement the same for this dice loss. I have already thought of two solutions, but I don’t like either of them:

1. The worst: re-encode all the labels so that the ignore_index becomes a valid new label, which means modifying my classifier layer. This is really ugly for a lot of reasons.
2. Inside the loss function, remap the ignored label to a new label, expand the output to match the resulting size, and finally ignore this label at the end. I don’t really like this solution either: it involves copying and modifying the targets and expanding the channel dimension of the output tensor (I think).
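
A possible variant of the second idea, sketched here only as an untested-in-training assumption: instead of adding a channel, build a mask of valid pixels, temporarily send ignored pixels to class 0 so that scatter_ gets a valid index, and then zero both the encoded target and the output at those pixels. The function name masked_dice, the default ignore_index=255, and the eps term are all hypothetical additions, and the code uses plain tensors rather than Variable.

```python
import torch

def masked_dice(output, target, ignore_index=255, weights=1, eps=1e-8):
    """output: (N, C, H, W) predictions; target: (N, H, W) int64 labels."""
    valid = target != ignore_index                   # (N, H, W) bool mask
    # Ignored pixels get a placeholder class so scatter_ sees a valid index
    safe_target = target.masked_fill(~valid, 0)

    encoded = torch.zeros_like(output)
    encoded.scatter_(1, safe_target.unsqueeze(1), 1)

    # Zero out ignored pixels in both tensors so they drop out of every sum
    valid = valid.unsqueeze(1).to(output.dtype)      # (N, 1, H, W)
    encoded = encoded * valid
    output = output * valid

    num = (output * encoded).sum(dim=(2, 3))
    den = output.pow(2).sum(dim=(2, 3)) + encoded.pow(2).sum(dim=(2, 3))

    # eps (an assumption) avoids 0/0 when a class is absent from both tensors
    dice = (2 * num / (den + eps)) * weights
    return dice.sum() / dice.size(0)
```

This still copies the target once, but the output keeps its original channel dimension, and the gradient with respect to the output is zero at ignored pixels because they are multiplied by zero.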

If you have already faced this kind of problem, I would like to have your point of view on this.
Thanks!