Loss functions for sequence labels?

AlexisW · March 14, 2019, 7:51am

Hello,

I have a model that returns a sequence of binary predictions of length k, e.g., [0, 0.2, 0.6, 0.4, 0.8], and I have labels like [0, 1, 1, 0, 0]. How should I define the loss function here? Thanks for your help!

ptrblck · March 14, 2019, 12:40pm

You could use nn.BCELoss for your output and target.
However, this treats each position independently, so it won’t use any sequential information in the data.
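As a minimal sketch of that suggestion, the snippet below applies nn.BCELoss to the exact values from the question. It assumes the predictions are already probabilities in [0, 1] (for example, after a sigmoid) and that the targets are floats.

```python
import torch
import torch.nn as nn

# Values from the question: per-position probabilities and binary labels.
pred = torch.tensor([0.0, 0.2, 0.6, 0.4, 0.8])
target = torch.tensor([0.0, 1.0, 1.0, 0.0, 0.0])  # BCELoss needs float targets

criterion = nn.BCELoss()        # default reduction="mean" averages over positions
loss = criterion(pred, target)  # scalar tensor
print(loss.item())
```

If the model returns raw scores instead of probabilities, nn.BCEWithLogitsLoss is the usual choice, since it applies the sigmoid internally.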

imaluengo · March 14, 2019, 12:44pm

Are your sequences of fixed size (fixed K)? If so, you can do as @ptrblck suggests and use nn.BCELoss or nn.MultiLabelSoftMarginLoss (which treats the problem as a multi-label problem).

If K is variable, then I’d suggest treating the problem as temporal classification: have an LSTM return one prediction per time step and apply a standard cross-entropy loss to each step.
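A rough sketch of that idea follows. The model name StepTagger, the layer sizes, and the random input features are all made up for illustration; only the overall shape (an LSTM with one prediction per step, trained with cross-entropy) comes from the suggestion above. Feeding one sequence at a time sidesteps padding entirely.

```python
import torch
import torch.nn as nn

class StepTagger(nn.Module):  # hypothetical model name
    def __init__(self, input_size=4, hidden_size=8, num_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        out, _ = self.lstm(x)   # (batch, steps, hidden_size)
        return self.head(out)   # (batch, steps, num_classes) raw logits

torch.manual_seed(0)
model = StepTagger()
criterion = nn.CrossEntropyLoss()  # expects raw logits and integer class targets

# Two sequences of different length (2 and 3 steps) with random features.
sequences = [torch.randn(1, 2, 4), torch.randn(1, 3, 4)]
targets = [torch.tensor([1, 1]), torch.tensor([1, 0, 1])]

losses = []
for x, y in zip(sequences, targets):
    logits = model(x).squeeze(0)         # (steps, num_classes)
    losses.append(criterion(logits, y))  # mean over this sequence's steps
loss = torch.stack(losses).mean()        # average over the sequences
loss.backward()
```

Processing sequences one by one is simple but slow; batching several sequences together requires padding, and the padded steps then have to be kept out of the loss.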

AlexisW · March 14, 2019, 8:58pm

Does it handle the variable length issue?

The example I mentioned was bad, so let me try to rephrase it with a better one. For example, the first pair may be pred [[0.2, 0.8], [0.9, 0.1]] and target [1, 1], while the second pair may be pred [[0.4, 0.6], [0.7, 0.3], [0.9, 0.1]] and target [1, 0, 1]. I padded them to the same length, but I am not sure how to handle the variable lengths when computing the loss.
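For reference, here is the padded example written out as tensors, together with one common way to keep padded steps out of the loss. This is only a sketch under two assumptions: the padded target positions are marked with -100 (the default ignore_index), and the predictions are probabilities, so the log is taken manually and nn.NLLLoss is used. The padding row [0.5, 0.5] is an arbitrary placeholder.

```python
import torch
import torch.nn as nn

PAD = -100  # assumed marker for padded target positions

# Both sequences padded to length 3; the first one has a single padded step.
probs = torch.tensor([
    [[0.2, 0.8], [0.9, 0.1], [0.5, 0.5]],  # last row is padding
    [[0.4, 0.6], [0.7, 0.3], [0.9, 0.1]],
])
target = torch.tensor([
    [1, 1, PAD],
    [1, 0, 1],
])

# NLLLoss expects log-probabilities; targets equal to ignore_index are skipped
# and the mean is taken over the remaining (real) steps only.
criterion = nn.NLLLoss(ignore_index=PAD)
loss = criterion(probs.log().reshape(-1, 2), target.reshape(-1))
print(loss.item())
```

If the model outputs raw logits rather than probabilities, nn.CrossEntropyLoss accepts the same ignore_index argument and can be applied to the logits directly.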

AlexisW · March 14, 2019, 8:59pm

Thanks for the reply! The length k actually varies, and I padded the sequences to the same length. In this situation, is there a good way to compute the loss (please see my updated example in reply to @ptrblck)?

AlexisW · March 15, 2019, 12:53am

Also, if I have only a few positive labels, is there a good way to balance them?
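One option that might apply here is the pos_weight argument of nn.BCEWithLogitsLoss, which scales the loss on positive targets. The sketch below uses made-up logits and labels, and sets the weight to the negative-to-positive ratio, which is a common heuristic rather than the only choice.

```python
import torch
import torch.nn as nn

# Made-up raw scores (logits) and labels with a single positive.
logits = torch.tensor([-2.0, -1.0, 0.5, -0.5, 1.5])
target = torch.tensor([0.0, 1.0, 0.0, 0.0, 0.0])

num_pos = target.sum()
num_neg = target.numel() - num_pos
pos_weight = (num_neg / num_pos).reshape(1)  # ratio 4:1, shape (1,) for broadcasting

weighted = nn.BCEWithLogitsLoss(pos_weight=pos_weight)(logits, target)
plain = nn.BCEWithLogitsLoss()(logits, target)
print(plain.item(), weighted.item())
```

With this weighting, an error on the rare positive label contributes four times as much to the loss as it would unweighted.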