Masking binary cross entropy loss

Hi,
I’m trying to implement a music transcription system. Each label has a shape of [88 x num_frames]. Each element represents the activation of the corresponding piano key on the corresponding frame, so the problem can be considered a binary classification problem. To make a batch, each label gets padded with zeros so that all of them have the same number of frames.
Now my question is: how can I ignore these padded values in the loss function? Also, as the data is heavily unbalanced, how can I use class weights when computing the loss?
I prefer to use binary cross entropy as the loss function.

Hello Arman!

The function version of binary_cross_entropy (as distinct from the
class (function object) version, BCELoss) supports a fine-grained,
per-individual-element-of-each-sample weight argument.

So, using this, you could weight the loss contribution of each frame
separately, and, in particular, give the padding frames a weight of zero.
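As a sketch (the shapes, names, and frame counts below are my own invention, and I assume the predictions have already been passed through a sigmoid), zeroing out the padded frames might look like:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# hypothetical batch: 2 samples, 88 keys, padded to 5 frames each
preds = torch.sigmoid(torch.randn(2, 88, 5))      # probabilities in (0, 1)
labels = torch.randint(0, 2, (2, 88, 5)).float()

# suppose sample 0 has 5 real frames and sample 1 only 3
# (so frames 3-4 of sample 1 are padding)
valid_frames = [5, 3]
weight = torch.zeros_like(labels)
for i, n in enumerate(valid_frames):
    weight[i, :, :n] = 1.0

# reduction='sum' plus dividing by weight.sum() averages over the real
# frames only; plain reduction='mean' would divide by the total element
# count, padding included
loss = F.binary_cross_entropy(
    preds, labels, weight=weight, reduction="sum"
) / weight.sum()
```

Because the padded elements get weight zero, their label values (whatever the padding happens to contain) have no effect on the loss.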

You can also use the weight argument to reweight your unbalanced
data, at the granularity that is the most logical for your use case.

I imagine that you wouldn’t want to reweight individual notes (but you
could). Maybe it would make sense to reweight the combination of
notes that appear together in a given frame. Anyway, for each batch,
you would go through the batch labels, sample by sample, and create
the per-sample, per-frame, per-note weights, where, again, I imagine
that the per-note weights are all equal within a given frame.

(Also, you will presumably prefer to use the “logits” version,
binary_cross_entropy_with_logits.)
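The logits version also accepts a pos_weight argument, which reweights only the positive term and broadcasts against the target; a minimal sketch (one assumed weight per key, with 10.0 as an arbitrary example value) would be:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

logits = torch.randn(88, 5)                       # raw scores, no sigmoid
labels = torch.randint(0, 2, (88, 5)).float()

# one pos_weight per key, shape [88, 1], broadcast across the frames
pos_weight = torch.full((88, 1), 10.0)

loss = F.binary_cross_entropy_with_logits(logits, labels, pos_weight=pos_weight)
```

(pos_weight handles the class imbalance; you can still pass the padding mask through the separate weight argument at the same time.)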

Good luck.

K. Frank
