I’m implementing the paper "Learning from Between-class Examples for Deep Sound Recognition" (Chainer implementation: https://github.com/mil-tokyo/bc_learning_image).
The method used in the paper works by mixing two inputs and their respective targets. This requires the targets to be smooth (float/double).
However, nll_loss (used by CrossEntropyLoss) requires the target tensor to be of type Long, i.e. integer class indices.
One idea is to take a weighted sum of the hard-label loss for each nonzero label, weighted by that label's mixing ratio. This seems reasonable to me, since there are exactly two such labels in this case (because you mix two samples).
What's the recommended way to go about this?
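For reference, a minimal sketch of the weighted-sum idea, assuming each mixed sample carries two integer labels and one mixing ratio. The function name `mixed_hard_label_loss` and the example tensors are made up for illustration and are not taken from the paper or the Chainer code.

```python
import torch
import torch.nn.functional as F


def mixed_hard_label_loss(logits, label_a, label_b, ratio):
    """Weighted sum of two hard-label cross entropies.

    logits:  (N, C) float tensor of raw scores
    label_a: (N,) long tensor, class of the first mixed sample
    label_b: (N,) long tensor, class of the second mixed sample
    ratio:   (N,) float tensor, weight of label_a (label_b gets 1 - ratio)
    """
    # Per-sample losses, so each sample can use its own mixing ratio.
    loss_a = F.cross_entropy(logits, label_a, reduction="none")
    loss_b = F.cross_entropy(logits, label_b, reduction="none")
    return (ratio * loss_a + (1.0 - ratio) * loss_b).mean()


# Hypothetical batch: 4 samples, 10 classes.
torch.manual_seed(0)
logits = torch.randn(4, 10)
label_a = torch.tensor([1, 3, 5, 7])
label_b = torch.tensor([2, 4, 6, 8])
ratio = torch.tensor([0.7, 0.5, 0.2, 0.9])

loss = mixed_hard_label_loss(logits, label_a, label_b, ratio)
```

Because cross entropy is linear in the target, this should give the same value as a cross entropy computed directly against the mixed soft target.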
What exactly are you looking for in the loss function?
C classes, targets are multi-label such that:
- Targets are smooth, in the [0, 1] range
- The targets sum to 1
- Only two classes out of C will be activated (!= 0)
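A loss matching the three requirements above could be written with plain tensor ops. This is only a sketch under those assumptions; `soft_cross_entropy` is a hypothetical helper, not a PyTorch built-in, and the target values are invented.

```python
import torch
import torch.nn.functional as F


def soft_cross_entropy(logits, soft_targets):
    """Cross entropy against float targets of shape (N, C) whose rows sum to 1."""
    log_probs = F.log_softmax(logits, dim=1)
    # Sum over classes, then average over the batch.
    return -(soft_targets * log_probs).sum(dim=1).mean()


torch.manual_seed(0)
logits = torch.randn(2, 5)

# Each row has exactly two nonzero entries that sum to 1.
soft_targets = torch.tensor([
    [0.0, 0.6, 0.0, 0.4, 0.0],
    [0.25, 0.0, 0.0, 0.0, 0.75],
])

loss = soft_cross_entropy(logits, soft_targets)
```

With one-hot float targets this reduces to the usual hard-label cross entropy.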
Does binary cross entropy work for your purposes?
In the paper (and the Chainer code) they used cross entropy, but the extra loss term in binary cross entropy might not be a problem. I’ll give it a try.
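For comparison, a small sketch of what trying binary cross entropy would look like, with invented example values. `F.binary_cross_entropy_with_logits` accepts float targets, but it applies an independent sigmoid per class rather than a softmax, and the extra term mentioned above is the `(1 - t) * log(1 - p)` part.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(2, 5)
soft_targets = torch.tensor([
    [0.0, 0.6, 0.0, 0.4, 0.0],
    [0.25, 0.0, 0.0, 0.0, 0.75],
])

# Built-in version: float targets are accepted directly.
bce = F.binary_cross_entropy_with_logits(logits, soft_targets)

# The same quantity written out, to make the extra term visible.
p = torch.sigmoid(logits)
manual = -(soft_targets * p.log() + (1 - soft_targets) * (1 - p).log()).mean()
```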
I’m confused. The Chainer implementation uses softmax_cross_entropy, which, according to the docs, takes integer targets like PyTorch’s cross entropy. What am I missing here?
Oops! Good catch!
It turns out that the implementation calculates the KL divergence (the cross entropy is computed with tensor ops to accommodate the soft targets).
Sorry for the confusion.
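A rough PyTorch sketch of a KL divergence against soft targets, built from tensor ops. It is written from the description above rather than ported from the Chainer code, and `kl_divergence` and the example values are illustrative assumptions.

```python
import torch
import torch.nn.functional as F


def kl_divergence(logits, soft_targets):
    """KL(target || softmax(logits)), averaged over the batch."""
    log_probs = F.log_softmax(logits, dim=1)
    # Cross entropy term: -sum_c t_c * log p_c
    cross_entropy = -(soft_targets * log_probs).sum(dim=1)
    # Negative entropy term: sum_c t_c * log t_c, treating 0 * log(0) as 0
    # by replacing zero targets with 1 before taking the log.
    safe_targets = torch.where(soft_targets > 0, soft_targets,
                               torch.ones_like(soft_targets))
    neg_entropy = (soft_targets * safe_targets.log()).sum(dim=1)
    return (cross_entropy + neg_entropy).mean()


torch.manual_seed(0)
logits = torch.randn(2, 5)
soft_targets = torch.tensor([
    [0.0, 0.6, 0.0, 0.4, 0.0],
    [0.25, 0.0, 0.0, 0.0, 0.75],
])

kl = kl_divergence(logits, soft_targets)
```

The entropy term does not depend on the logits, so the gradients should match those of the soft-target cross entropy; only the reported loss value differs.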