Predictions stuck at zero when positive label (1) is only 16% of data

Hi Mona!

Typically, we reweight classes with the reciprocal of their frequencies.
However, CrossEntropyLoss doesn’t care about the overall scale of the
weights: with its default reduction = 'mean', it computes the weighted
average of the individual sample losses, so a common factor applied to
all of the weights cancels out.

>>> class_frequencies = torch.tensor([0.84, 0.16])
>>> class_weightsA = 1 / class_frequencies
>>> class_weightsA
tensor([1.1905, 6.2500])
>>> class_weightsB = class_frequencies[0] * class_weightsA
>>> class_weightsB
tensor([1.0000, 5.2500])

Here, class_weightsA and class_weightsB differ only in their overall
scale – the relative weights of “class-0” and “class-1” are the same, so
they’ll give the same result with CrossEntropyLoss.
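Here’s a quick check (a sketch with made-up random logits and targets,
using the default reduction = 'mean'):

>>> logits = torch.randn(8, 2)
>>> targets = torch.randint(2, (8,))
>>> lossA = torch.nn.CrossEntropyLoss(weight=class_weightsA)(logits, targets)
>>> lossB = torch.nn.CrossEntropyLoss(weight=class_weightsB)(logits, targets)
>>> torch.allclose(lossA, lossB)
True

The weighted mean divides by the sum of the sample weights, so the
common factor of class_frequencies[0] drops out.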

I chose class_weightsB for my example to make the relationship to
pos_weight in the BCEWithLogitsLoss version numerically obvious.
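
Concretely, in the two-class case, CrossEntropyLoss with weight =
[1.0, w] on a pair of logits matches BCEWithLogitsLoss with
pos_weight = w on their difference, sample by sample (up to
floating-point round-off). A sketch, again with made-up data – note
that I use reduction = 'none' here, because the two losses normalize
a 'mean' reduction differently (weighted mean vs. plain mean):

>>> logits = torch.randn(8, 2)
>>> targets = torch.randint(2, (8,))
>>> ce = torch.nn.CrossEntropyLoss(weight=class_weightsB, reduction='none')
>>> bce = torch.nn.BCEWithLogitsLoss(pos_weight=class_weightsB[1:], reduction='none')
>>> torch.allclose(ce(logits, targets), bce(logits[:, 1] - logits[:, 0], targets.float()))
True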

Best.

K. Frank
