I have a problem with target classes from 0 to 6, but the number of targets per sample is not fixed. For example, while sample x_k has targets [0, 1, 5, 3], sample x_j might have [0, 2].
The first thing that came to my mind was padding all labels with -100 up to the maximum label length (35) and then using nn.BCEWithLogitsLoss. But then I read that BCEWithLogitsLoss would treat the targets as if there were 35 different classes.
What is the best way I should follow in this case? Thanks.
Assuming each sample can have zero, one, or multiple active classes, you could let the model return logits in the shape [batch_size, nb_classes], where each value in the nb_classes dimension corresponds to the logit for that class index.
The target could then be multi-hot encoded, where a zero would indicate an inactive and a one an active class. For the posted examples the target would thus be:
[0, 1, 5, 3] → target = torch.tensor([[1, 1, 0, 1, 0, 1, 0]]).float()
[0, 2]       → target = torch.tensor([[1, 0, 1, 0, 0, 0, 0]]).float()
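A minimal sketch of this approach, assuming nb_classes=7 (classes 0 to 6 as in the question) and using the two posted samples; the variable names are illustrative:

```python
import torch
import torch.nn as nn

nb_classes = 7
labels = [[0, 1, 5, 3], [0, 2]]  # variable-length label lists per sample

# build multi-hot targets: 1 at every active class index, 0 elsewhere
target = torch.zeros(len(labels), nb_classes)
for i, lab in enumerate(labels):
    target[i, lab] = 1.0

# dummy logits standing in for a model output of shape [batch_size, nb_classes]
logits = torch.randn(len(labels), nb_classes)

criterion = nn.BCEWithLogitsLoss()
loss = criterion(logits, target)
```

No padding is needed, since every sample maps to the same fixed-size multi-hot vector regardless of how many classes are active.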
My problem is more like sequential decoding. Each target can be repeated, e.g. [5, 2, 1, 0, 1]. I padded the targets and used nn.CrossEntropyLoss() with the ignore_index=-100 parameter, but still no improvement…
Thanks for your answer.
Thanks for the update.
If I understand the use case correctly, you might then be working on multi-class sequence classification, i.e. each time step has exactly one label?
In this case, you could use nn.CrossEntropyLoss with a model output in the shape [batch_size, nb_classes, seq_len] and a target in the shape [batch_size, seq_len] containing the class indices in the range [0, nb_classes-1].
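A short sketch of that setup, assuming seq_len=5 and using the [5, 2, 1, 0, 1] sequence from the follow-up; the second (shorter) sequence is padded with -100 so those positions are ignored by the loss:

```python
import torch
import torch.nn as nn

batch_size, nb_classes, seq_len = 2, 7, 5

# dummy model output: [batch_size, nb_classes, seq_len]
logits = torch.randn(batch_size, nb_classes, seq_len)

# targets: [batch_size, seq_len], shorter sequences padded with -100
target = torch.tensor([
    [5, 2, 1, 0, 1],           # full-length sequence from the follow-up
    [0, 2, -100, -100, -100],  # shorter sequence, padded
])

criterion = nn.CrossEntropyLoss(ignore_index=-100)
loss = criterion(logits, target)
```

With ignore_index=-100, the padded positions contribute nothing to the loss or its gradient, so only the real time steps are trained on.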