Selecting a Loss Function for a Multi-Label Task

I am working on a multi-label task, such as text classification, where one text can carry several labels at once. For example:
text_a = […001000100001110111…], where there are N labels in total, so the target vector for text_a has dimension N.

My idea is to feed the text into an RNN, then map the final output to a vector with the same dimension as the number of labels, i.e. N.
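The idea above could be sketched roughly like this (all sizes, the GRU choice, and the class name `MultiLabelRNN` are illustrative assumptions, not a fixed design):

```python
import torch
import torch.nn as nn

# Illustrative sizes; real values depend on the dataset.
N_LABELS = 20
VOCAB_SIZE = 1000
EMBED_DIM = 64
HIDDEN_DIM = 128

class MultiLabelRNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, EMBED_DIM)
        self.rnn = nn.GRU(EMBED_DIM, HIDDEN_DIM, batch_first=True)
        # Map the final hidden state to N raw label scores (logits).
        self.fc = nn.Linear(HIDDEN_DIM, N_LABELS)

    def forward(self, token_ids):
        emb = self.embed(token_ids)   # (batch, seq_len, EMBED_DIM)
        _, h = self.rnn(emb)          # h: (1, batch, HIDDEN_DIM)
        return self.fc(h.squeeze(0))  # (batch, N_LABELS) logits

model = MultiLabelRNN()
logits = model(torch.randint(0, VOCAB_SIZE, (4, 12)))  # 4 texts, 12 tokens each
print(logits.shape)  # (4, N_LABELS)
```

The output is left as raw logits so that the loss function can apply its own sigmoid/softmax.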

There are three candidate loss functions I am considering:

  1. KLDivLoss
    The true labels and the predictions can each be viewed as a distribution, and the model tries to make the two distributions similar.
  2. MultiLabelSoftMarginLoss
    Is this choice right? If it is suitable for the task above, how do I use this loss function?
  3. Bayesian Personalized Ranking
    Is there a ready-made function for BPR in PyTorch?
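To make the question concrete, here is a minimal sketch of how I imagine each candidate would be applied to the model's logits (shapes and the seed are arbitrary; the `bpr_loss` helper is hand-rolled by me as an illustration, since I have not found a built-in BPR loss; the `BCEWithLogitsLoss` line is only there as a sanity check of my understanding of `MultiLabelSoftMarginLoss`):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 20)                      # hypothetical model outputs: 4 texts, 20 labels
targets = torch.randint(0, 2, (4, 20)).float()   # multi-hot ground-truth labels

# 2. MultiLabelSoftMarginLoss: an independent sigmoid + binary cross-entropy
#    term per label, expecting multi-hot targets like the ones above.
ml_loss = nn.MultiLabelSoftMarginLoss()(logits, targets)

# Sanity check: with default reductions this should match BCEWithLogitsLoss.
bce_loss = nn.BCEWithLogitsLoss()(logits, targets)

# 1. KLDivLoss: treat the labels as a probability distribution. It expects
#    log-probabilities as input and probabilities as target, so the multi-hot
#    vector has to be normalized first.
log_probs = F.log_softmax(logits, dim=1)
target_dist = targets / targets.sum(dim=1, keepdim=True).clamp(min=1)
kl_loss = nn.KLDivLoss(reduction="batchmean")(log_probs, target_dist)

def bpr_loss(logits, targets):
    # Hand-rolled BPR sketch (assumption, not a library function): for each
    # sample, push every positive label's score above every negative label's.
    pos = targets.bool()
    per_sample = []
    for row_logits, row_pos in zip(logits, pos):
        pos_scores = row_logits[row_pos]
        neg_scores = row_logits[~row_pos]
        if pos_scores.numel() == 0 or neg_scores.numel() == 0:
            continue  # no ranking pairs for this sample
        diff = pos_scores.unsqueeze(1) - neg_scores.unsqueeze(0)
        per_sample.append(-F.logsigmoid(diff).mean())
    return torch.stack(per_sample).mean()

bpr = bpr_loss(logits, targets)
print(ml_loss.item(), kl_loss.item(), bpr.item())
```

Is this roughly how each of the three would be wired up, or am I misusing any of them?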

What are the differences between these loss functions for such a task? Are there other options I should consider?