Loss function for binary classification

Hey all,

I am trying to utilise BCELoss with weights, but I am struggling to understand how. I am currently using an LSTM model to detect an event in time-series data. My output from the model and the true_output both have the shape [batch_size, seq_length].

Currently, I think I have managed to hard-code it, but it is not the best way to achieve this:

    # self.criterion is nn.BCELoss(reduction='none'), so we get one loss value per element
    loss_get = self.criterion(predictions.float(), target.float())
    # weighted binary cross-entropy: up-weight the loss on the positive samples
    loss_flat = loss_get.flatten()
    target_flat = target.flatten()
    loss_flat[target_flat == 1] *= self.pos_weight_factor
    loss = loss_flat.mean()
    loss.backward()

My datasets are imbalanced: the sequences do not have a constant length, and there are far more 0’s than 1’s (approximately 100:1), hence I need to up-weight the loss on the 1’s by multiplying it by some arbitrary factor. I understand that there are a few topics on this, but I cannot quite get my head around them. How do I apply a weighted BCE loss to an imbalanced dataset? What will the weight tensor contain?
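For instance, here is my guess at what the weight tensor might contain, as a per-element factor built from the targets (pos_weight_factor being my arbitrary number, e.g. 100):

    import torch
    import torch.nn.functional as F

    pos_weight_factor = 100.0

    # predictions: probabilities in [0, 1]; target: 0./1. labels, both [batch_size, seq_length]
    predictions = torch.rand(8, 50)
    target = (torch.rand(8, 50) < 0.01).float()

    # per-element weight: pos_weight_factor for positives, 1 for negatives
    weight = torch.where(target == 1.0,
                         torch.full_like(target, pos_weight_factor),
                         torch.ones_like(target))

    loss = F.binary_cross_entropy(predictions, target, weight=weight)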

Therefore, if I wanted to apply weights, should I use the built-in function, or the solution suggested by this post? https://discuss.pytorch.org/t/solved-class-weight-for-bceloss/3114?u=ykukkim

Can anyone guide me through this?

Thanks!

Hello Yong Kuk!

The most straightforward way to do this (and also better for numerical
reasons) is to adjust your network so that it outputs raw-score logits
for its predictions, rather than probabilities. (For example, if the last
layer of your network is a Sigmoid – that converts a logit to a
probability – just get rid of the Sigmoid layer.)
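As a sketch (I haven’t seen your model, so the layer names and sizes here are made up for illustration), the logits version would look something like:

    import torch.nn as nn

    class EventDetector(nn.Module):
        def __init__(self, input_size, hidden_size):
            super().__init__()
            self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
            self.fc = nn.Linear(hidden_size, 1)
            # note: no nn.Sigmoid() here -- the model returns raw-score logits

        def forward(self, x):
            # x: [batch_size, seq_length, input_size]
            out, _ = self.lstm(x)
            return self.fc(out).squeeze(-1)   # logits, [batch_size, seq_length]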

Then use BCEWithLogitsLoss instead of BCELoss. This is because
BCEWithLogitsLoss offers a pos_weight argument that it uses to
reweight positive samples in the loss function. In your case you would
set pos_weight to something like 100. (BCELoss does not have a
pos_weight argument – probably just an oversight, rather than for
any particular reason.)
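Something like this (the value of 100 just reflects the 100:1 ratio you quoted):

    import torch
    import torch.nn as nn

    # pos_weight rescales the loss term for positive (target == 1) samples
    criterion = nn.BCEWithLogitsLoss(pos_weight=torch.tensor(100.0))

    logits = torch.randn(8, 50, requires_grad=True)   # raw scores from the network
    target = (torch.rand(8, 50) < 0.01).float()       # imbalanced 0/1 labels

    loss = criterion(logits, target)
    loss.backward()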

For some further details, please take a look at this recent thread:

Good luck!

K. Frank

Hey Frank,

Thank you for your reply! It has cleared up a few things for me, and reading through your thread helped even more.

However, I am very new to machine learning, and I am slightly confused by the following terms:

multi-label, multi-class classification

Would you care to explain these for me?

Furthermore, it seems to me that your method is pretty much the same as what I have already done, since the sigmoid function is performed internally by BCEWithLogitsLoss. Have I understood correctly?

Thank you!

Hi Yong Kuk!

By way of example, in a conventional three-class (“cat,” “dog,” “bird”)
classification problem, given an image, you would say that it is an
image of exactly one of a cat or a dog or a bird. (And you wouldn’t say
it was “none of the above” unless you explicitly had a fourth, “none of
the above” class.)

In a multi-label (and in this case, three-class) classification problem
you would say that an image does or does not contain a cat, and
also does or does not contain a dog, and also does or does not contain
a bird. It can contain any combination, and it might not contain any of
the above, and it might contain all three. You can see that such a
multi-label problem is three binary problems (cat: yes or no, dog:
yes or no, bird: yes or no) run at the same time with the same network.
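To make the difference concrete, here is a sketch of what the targets look like in the two settings (the class order cat, dog, bird is just for illustration):

    import torch

    # multi-class: one integer class index per sample (as used by CrossEntropyLoss)
    # 0 = cat, 1 = dog, 2 = bird
    multiclass_target = torch.tensor([0, 2, 1])

    # multi-label: one yes/no flag per class per sample (as used by BCEWithLogitsLoss)
    # columns: [cat, dog, bird]
    multilabel_target = torch.tensor([[1., 0., 0.],   # just a cat
                                      [1., 1., 1.],   # cat, dog, and bird
                                      [0., 0., 0.]])  # none of the above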

Yes, BCEWithLogitsLoss calculates LogSigmoid internally (in effect
calculating Sigmoid internally). This is numerically more stable than
passing your logits through Sigmoid and then passing them to
BCELoss. (Unless you have specific reason why you need to use
BCELoss – and understand it – you should always use
BCEWithLogitsLoss instead.)
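You can check that the two versions agree (up to floating-point noise) with something like:

    import torch
    import torch.nn as nn

    logits = torch.randn(8, 50)
    target = torch.randint(0, 2, (8, 50)).float()

    loss_a = nn.BCEWithLogitsLoss()(logits, target)
    loss_b = nn.BCELoss()(torch.sigmoid(logits), target)

    print(torch.allclose(loss_a, loss_b))   # True for logits of moderate size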

Best.

K. Frank