I have several ordinal classes, each taking values 1-5 with some implicit ordering relationships, and I decided to multi-hot encode them cumulatively (thermometer-style) so that each level maps to a bit pattern like
{
1: [0, 0, 0, 0]
2: [1, 0, 0, 0]
...
5: [1, 1, 1, 1]
}
I then concatenate the encodings across classes, so the resulting vector might look like
[0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1]
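Concretely, the encoding I mean can be sketched like this (helper names are just for illustration; the example vector above corresponds to levels 1, 3, and 5):

```python
def encode_level(level, n_levels=5):
    # Level k in 1..n_levels -> (n_levels - 1) bits with the first k-1 set,
    # e.g. 1 -> [0, 0, 0, 0], 3 -> [1, 1, 0, 0], 5 -> [1, 1, 1, 1].
    return [1 if i < level - 1 else 0 for i in range(n_levels - 1)]

def encode_sample(levels, n_levels=5):
    # Concatenate the per-class encodings into one target vector.
    out = []
    for lv in levels:
        out.extend(encode_level(lv, n_levels))
    return out

# encode_sample([1, 3, 5]) -> [0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1]
```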
What is a good loss function to use here? Is MultiLabelMarginLoss sufficient? I'm not sure a one-vs-all approach penalizes errors properly, given that several bits belong to each class.