[SOLVED] Class Weighted Binary Crossentropy not working, even with equal weights

Hi Andrei!

First, to answer the question I think you’re asking:

You should be using (as in the comment in your code)
BCEWithLogitsLoss.

BCEWithLogitsLoss supports sample weights, which you
can use for class weights.

Let’s say you have class weight w_1 for class 1, and w_0
for class 0. Let w_n be the sample weight for sample n.
Simply set w_n = w_1 if y_n = 1, and w_n = w_0 if
y_n = 0. (This assumes that the y_n are either 0 or 1, as
they should be if they are binary class labels.)
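Here is a minimal sketch of what that could look like (the class weights,
logits, and labels below are just made-up example values); using the
functional form of the loss makes it easy to rebuild the weight tensor
for every batch:

    import torch
    import torch.nn.functional as F

    w0, w1 = 0.3, 0.7                          # hypothetical class weights

    logits = torch.tensor([0.2, -1.5, 0.8])    # raw scores, no sigmoid applied
    targets = torch.tensor([1.0, 0.0, 1.0])    # binary labels, 0 or 1

    # per-sample weights: w1 where the label is 1, w0 where it is 0
    sample_weights = w1 * targets + w0 * (1.0 - targets)

    loss = F.binary_cross_entropy_with_logits(
        logits, targets, weight=sample_weights
    )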

Now some comments:

Note that using class weights w_1 = w_0 = 1/2 doesn’t give
you the same result as an unweighted loss function. It gives
you 1/2 the unweighted loss function. (The loss for each sample
is multiplied by 1/2). This doesn’t matter a lot, but, for example,
with plain-vanilla stochastic-gradient-descent optimization, it
has the effect of cutting your effective learning rate in half.
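You can check this directly (random example data here):

    import torch
    import torch.nn.functional as F

    logits = torch.randn(8)
    targets = (torch.rand(8) > 0.5).float()

    unweighted = F.binary_cross_entropy_with_logits(logits, targets)
    halved = F.binary_cross_entropy_with_logits(
        logits, targets, weight=torch.full_like(targets, 0.5)
    )

    print(torch.allclose(halved, 0.5 * unweighted))   # True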

Instead of clamping the sigmoid of your output, you should be
using torch.nn.LogSigmoid. This avoids the problem of
large negative → sigmoid → 0 → log → -inf.
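For example (with an exaggerated negative logit to force the underflow):

    import torch

    x = torch.tensor([-200.0])

    # naive version: sigmoid underflows to exactly 0, and log (0) is -inf
    print(torch.log(torch.sigmoid(x)))     # tensor([-inf])

    # LogSigmoid works in log space, so there is no underflow
    print(torch.nn.LogSigmoid()(x))        # tensor([-200.])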

In general, when you are testing / debugging something like
this, instead of running your full training code with a “default”
value like weights = (0.5, 0.5), you should try calling your
function on a single sample, with your default value and
compare the single numerical result with the result of the
standard unweighted function you are trying to mimic (in
this case BCEWithLogitsLoss). Only when you are happy
that you have that working should you try running a single
batch, and when that is working, try the training.
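As an illustration of that kind of single-sample check (my_weighted_bce
here is just a stand-in for whatever your own weighted-loss function
looks like):

    import torch
    import torch.nn.functional as F

    # stand-in for your own class-weighted BCE implementation
    def my_weighted_bce(logit, target, w0, w1):
        w = w1 if target.item() == 1.0 else w0
        return -w * (target * F.logsigmoid(logit)
                     + (1.0 - target) * F.logsigmoid(-logit))

    logit = torch.tensor([1.3])
    target = torch.tensor([1.0])

    # with both weights equal to 1 this should match the built-in loss
    # (and with w0 = w1 = 0.5 you should see exactly half of it)
    mine = my_weighted_bce(logit, target, w0=1.0, w1=1.0)
    reference = F.binary_cross_entropy_with_logits(logit, target)

    print(mine.item(), reference.item())   # the two numbers should agree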

Lastly, I think this discussion – especially the comment about
avoiding clamping – applies to your earlier thread:

and its linked thread:

Best regards.

K. Frank