Loss functions for sequence labels?


I have a model that returns a sequence of predictions of length k, e.g., [0, 0.2, 0.6, 0.4, 0.8], and I have binary labels like [0, 1, 1, 0, 0]. How could I define the loss function here? Thanks for your help!!

You could use nn.BCELoss for your output and target.
However, this won’t use any sequential information in the data.
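For reference, the math behind nn.BCELoss can be sketched in plain Python on the exact example from the question (the eps clamp is just an illustration detail to avoid log(0); the actual nn.BCELoss handles numerical stability internally):

```python
import math

def bce(preds, targets, eps=1e-7):
    """Element-wise binary cross-entropy, averaged over the sequence
    (the same 'mean' reduction nn.BCELoss uses by default)."""
    total = 0.0
    for p, t in zip(preds, targets):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(preds)

# The example from the question:
preds = [0.0, 0.2, 0.6, 0.4, 0.8]
targets = [0, 1, 1, 0, 0]
print(round(bce(preds, targets), 4))  # ≈ 0.8481
```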

Are your sequences of fixed size (fixed k)? If so, you can do as @ptrblck suggests and use nn.BCELoss, or nn.MultiLabelSoftMarginLoss (which will treat the problem as a multi-label problem).

If k is variable, then I’d suggest treating the problem as temporal classification: have an LSTM return a prediction per time step and use a standard cross-entropy loss.
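The per-step idea can be sketched in plain Python (in practice you’d feed the LSTM’s per-step logits to nn.CrossEntropyLoss; the explicit per-step loop here is just to make the math visible):

```python
import math

def step_cross_entropy(step_probs, target, eps=1e-7):
    """Cross-entropy for one time step: -log of the probability
    assigned to the correct class."""
    return -math.log(max(step_probs[target], eps))

def sequence_loss(pred_seq, target_seq):
    """Average the per-step losses over one sequence."""
    losses = [step_cross_entropy(p, t) for p, t in zip(pred_seq, target_seq)]
    return sum(losses) / len(losses)

pred = [[0.2, 0.8], [0.9, 0.1]]  # per-step class probabilities
target = [1, 1]
print(round(sequence_loss(pred, target), 4))  # ≈ 1.2629
```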

Does it handle the variable length issue?

The example I mentioned was bad; let me try to rephrase it to a better one. For example, the first pair may be pred [[0.2, 0.8], [0.9, 0.1]] with target [1, 1], while the second pair may be pred [[0.4, 0.6], [0.7, 0.3], [0.9, 0.1]] with target [1, 0, 1]. I padded them to the same length, but I am not sure how to handle the variable lengths when computing the loss.

Thanks for the reply! The length k actually varies, and I padded the sequences to the same length. In this situation, is there a good way to compute the loss (please see my updated example in the reply to @ptrblck)?
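One common approach is to mask out the padded steps so they contribute nothing to the loss: in PyTorch you’d typically build a mask from the original lengths (or use the ignore_index argument of nn.CrossEntropyLoss). A minimal plain-Python sketch of the idea, using the two pairs from the example above (marking padded positions with None is just this sketch’s convention, not a PyTorch one):

```python
import math

def masked_sequence_loss(pred_batch, target_batch, eps=1e-7):
    """Cross-entropy over padded sequences, averaged over *valid*
    steps only; padded positions (target None here) are skipped."""
    total, count = 0.0, 0
    for pred_seq, target_seq in zip(pred_batch, target_batch):
        for step_probs, t in zip(pred_seq, target_seq):
            if t is None:  # padded position: contributes nothing
                continue
            total += -math.log(max(step_probs[t], eps))
            count += 1
    return total / count

# The two pairs from the example, padded to length 3:
preds = [
    [[0.2, 0.8], [0.9, 0.1], [0.5, 0.5]],  # last step is padding
    [[0.4, 0.6], [0.7, 0.3], [0.9, 0.1]],
]
targets = [[1, 1, None], [1, 0, 1]]
print(round(masked_sequence_loss(preds, targets), 4))  # ≈ 1.1392
```

The key point is dividing by the number of real steps, not the padded length, so short sequences aren’t diluted by their padding.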

Also, since I have very few positive labels, is there a good way to balance the classes?
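A common remedy is to up-weight the positive term of the loss, which is what the pos_weight argument of nn.BCEWithLogitsLoss does. A plain-Python sketch of that weighting (the specific preds/targets and the pos_weight = n_negative / n_positive heuristic are illustrative assumptions):

```python
import math

def weighted_bce(preds, targets, pos_weight, eps=1e-7):
    """BCE where positive targets get an extra multiplicative weight,
    the same idea as pos_weight in nn.BCEWithLogitsLoss."""
    total = 0.0
    for p, t in zip(preds, targets):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        total += -(pos_weight * t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(preds)

# With 4x more negatives than positives, a common choice is
# pos_weight = n_negative / n_positive = 4:
preds = [0.2, 0.6, 0.4, 0.8, 0.3]
targets = [1, 0, 0, 0, 0]
print(round(weighted_bce(preds, targets, pos_weight=4.0), 4))  # ≈ 1.9662
```

Note that nn.BCEWithLogitsLoss expects raw logits rather than probabilities, so you would drop the final sigmoid from the model when using it.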