Cross entropy for soft label

rasbt · April 7, 2018, 5:27pm

Note that cross-entropy for non 0/1 labels is not symmetric, which could be an explanation for the poor performance.

E.g., consider the scenario for the binary cross entropy:

42%20PM

Or consider the following, where the ground truth and the predicted labels are shown on the x axis. I.e., you can see that if both are 0, the cost is zero. However, if both are 0.5, the cost is almost 0.7, although prediction=true label

34%20PM