BCEWithLogitsLoss() and sigmoid layer

BCEWithLogitsLoss() is typically used for multi-label classification. Since BCEWithLogitsLoss() already combines a sigmoid layer with BCELoss, there is no need to add a sigmoid layer to the model during training. But when evaluating, should we put a sigmoid layer back into the model and then binarize the activations with a 0.5 threshold? For example, with multi-label targets like:

[0, 0, 1, 1, 0, 0, 0, 1]

[0, 1, 1, 1, 0]
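To make the training setup concrete, here is a minimal sketch of the case described above: the model outputs raw logits with no sigmoid, and BCEWithLogitsLoss handles the sigmoid internally. The model, feature size, and label count are illustrative, not from the original post.

```python
import torch
import torch.nn as nn

# Hypothetical multi-label model: note there is no sigmoid in the model,
# because BCEWithLogitsLoss applies it internally (in a numerically
# stable way, fused with the log in the loss).
model = nn.Linear(16, 8)           # 16 features -> 8 independent labels
criterion = nn.BCEWithLogitsLoss()

x = torch.randn(4, 16)             # batch of 4 samples
targets = torch.tensor([[0, 0, 1, 1, 0, 0, 0, 1]] * 4, dtype=torch.float32)

logits = model(x)                  # raw scores, no sigmoid
loss = criterion(logits, targets)  # sigmoid + BCE computed together
loss.backward()
```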

No. Thresholding the raw logits at 0 is equivalent to thresholding at 0.5 after a sigmoid layer, since sigmoid(0) = 0.5 and the sigmoid is monotonically increasing.
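This equivalence is easy to verify numerically; a small self-contained sketch (plain Python, with example logits chosen for illustration):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

logits = [-2.3, -0.1, 0.0, 0.4, 1.7]

# Binarize the raw logits at threshold 0 ...
preds_from_logits = [int(z > 0) for z in logits]
# ... versus binarizing sigmoid activations at threshold 0.5.
preds_from_sigmoid = [int(sigmoid(z) > 0.5) for z in logits]

# Identical, because sigmoid(0) = 0.5 and sigmoid preserves ordering.
assert preds_from_logits == preds_from_sigmoid
```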


You are right; I was a bit confused. Thanks!