I'm working on a multi-label classification problem. There are 122 labels, and for each sample at most 10 of them are one while the rest are zero (so the targets are sparse multi-hot vectors).
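For concreteness, a single target vector looks like this (the active indices below are made up for illustration):

import torch

NUM_LABELS = 122
active = [3, 17, 54, 101]        # hypothetical positive labels (at most 10 per sample)

target = torch.zeros(NUM_LABELS)
target[active] = 1.0             # sparse multi-hot: 4 ones, 118 zeros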
I used BCEWithLogitsLoss
as a loss function in two ways:
1- without weights

import torch.nn as nn

criterion = nn.BCEWithLogitsLoss()      # sigmoid + BCE, averaged over all elements
outputs = self.model(inputs)            # raw logits, shape (batch_size, 122)
loss = criterion(outputs, targets)
2- with weights

criterion = nn.BCEWithLogitsLoss(reduction='none')   # keep the per-element losses
outputs = self.model(inputs)
loss = criterion(outputs, targets)                   # shape (batch_size, 122)
loss = (loss * self.CLASS_WEIGHT).mean()             # CLASS_WEIGHT: per-class weights, shape (122,)
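For context, self.CLASS_WEIGHT is a per-class weight tensor of shape (122,). A minimal sketch of one way to build such weights, assuming inverse label-frequency weighting over a hypothetical train_targets matrix (illustrative, not necessarily my exact values):

import torch

# hypothetical multi-hot target matrix for the whole training set: (num_samples, 122)
train_targets = (torch.rand(5000, 122) < 0.05).float()

pos_counts = train_targets.sum(dim=0).clamp(min=1.0)   # positives per class
class_weight = train_targets.shape[0] / pos_counts     # inverse frequency
class_weight = class_weight / class_weight.mean()      # normalize so the mean weight is 1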
In both cases, after a few iterations of the first epoch the F1 score is zero, because the model predicts zero for every label. Since the target tensor is sparse (zeros vastly outnumber ones), how can I improve the loss function to prevent this problem?
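For reference, one variant I am considering is the pos_weight argument of BCEWithLogitsLoss, which scales the loss of the positive targets per class. A minimal sketch, assuming a uniform factor (the 11x value is a rough guess from at most 10 positives out of 122 labels, not a tuned number):

import torch
import torch.nn as nn

# with at most 10 positives per 122 labels, negatives outnumber positives
# by roughly (122 - 10) / 10 ≈ 11 per sample, hence the guess below
pos_weight = torch.full((122,), 11.0)   # one weight per class
criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

# usage stays the same:
# loss = criterion(outputs, targets)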