The method used in the paper works by mixing two inputs and their respective targets, which requires the targets to be soft (floating-point) values.
However, PyTorch’s nll_loss (used by CrossEntropyLoss) requires the target tensor to be of type Long, i.e. integer class indices.
One idea is to compute a weighted sum of the hard losses, one per non-zero label. This seems reasonable to me, since there are exactly two such labels in this case (because two samples are mixed).
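A minimal sketch of that weighted-sum idea (the function name `mixup_criterion` and the variable names are my own, not from the paper):

```python
import torch
import torch.nn.functional as F

def mixup_criterion(pred, y_a, y_b, lam):
    # Weighted sum of two hard-label cross entropies, one per mixed sample.
    # y_a, y_b stay Long tensors, so nll_loss is happy.
    return lam * F.cross_entropy(pred, y_a) + (1 - lam) * F.cross_entropy(pred, y_b)

pred = torch.randn(4, 10)                # logits for a batch of 4
y_a = torch.tensor([0, 1, 2, 3])         # labels of the first mixed samples
y_b = torch.tensor([4, 5, 6, 7])         # labels of the second mixed samples
loss = mixup_criterion(pred, y_a, y_b, lam=0.7)
```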
What’s the recommended way to go about this?
In the paper (and the Chainer code) they used cross entropy, but the extra loss term from binary cross entropy might not be a problem. I’ll give it a try.
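For reference, the binary cross entropy route would look roughly like this: mix one-hot float targets directly and feed them to `binary_cross_entropy_with_logits`, which accepts soft targets (the shapes and `num_classes` below are just illustrative):

```python
import torch
import torch.nn.functional as F

num_classes = 5
lam = 0.7
y_a = torch.tensor([1, 3])
y_b = torch.tensor([2, 0])

# Mix the one-hot targets as floats; no Long constraint here.
mixed_targets = (lam * F.one_hot(y_a, num_classes).float()
                 + (1 - lam) * F.one_hot(y_b, num_classes).float())

logits = torch.randn(2, num_classes)
loss = F.binary_cross_entropy_with_logits(logits, mixed_targets)
```

Note this treats each class as an independent binary problem, which is where the extra loss term relative to softmax cross entropy comes from.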
I’m confused. The Chainer implementation uses softmax_cross_entropy, which, according to the docs, takes integer targets just like PyTorch’s cross entropy. What am I missing here?