Loss Function: CrossEntropyLoss VS BCEWithLogitsLoss

rbidanta · April 7, 2018, 3:24pm

Hi All,

This is a conceptual question about loss functions: I am trying to understand in which scenarios I should use BCEWithLogitsLoss rather than CrossEntropyLoss. (Apologies if this is too naive a question.)

I am currently working on an image segmentation project where I intend to use a U-Net model. The paper states, “The energy function is computed by a pixel-wise soft-max over the final feature map combined with the cross entropy loss function”, and going by the PyTorch documentation, this loss seems similar to BCEWithLogitsLoss. Any guidance would be really helpful.

Thanks,

smth · April 7, 2018, 3:28pm

If you are doing pixel-wise image segmentation, just use CrossEntropyLoss over your output channel dimension.
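A minimal sketch of what "CrossEntropyLoss over the channel dimension" can look like. The shapes (batch of 2, 3 classes, 4x4 images) and the random tensors are made up for illustration and stand in for a real network output and label map:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical shapes: batch of 2, 3 classes, 4x4 images.
logits = torch.randn(2, 3, 4, 4)          # raw network output (N, C, H, W), no softmax applied
target = torch.randint(0, 3, (2, 4, 4))   # one integer class index per pixel (N, H, W), dtype long

# CrossEntropyLoss applies log-softmax over dim 1 (the channels) internally.
criterion = nn.CrossEntropyLoss()
loss = criterion(logits, target)

# The same computation spelled out: log-softmax over channels, then NLL.
manual = nn.functional.nll_loss(torch.log_softmax(logits, dim=1), target)
```

Here `loss` and `manual` should agree, which is the "pixel-wise soft-max combined with cross entropy" from the paper quote. Note that the target holds class indices, not one-hot maps.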

BCEWithLogitsLoss is needed when you have soft labels, i.e. instead of {dog at (1, 1), cat at (4, 20)} the target is like {dog with strength 0.3 at (1, 1), …}.
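As a rough illustration of the soft-label case: the channel assignment (channel 0 = dog, channel 1 = cat), the shapes, and the random logits below are assumptions made up for the sketch, not something from the original post.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical setup: 1 image, 2 channels (0 = dog, 1 = cat), 4x4 pixels.
logits = torch.randn(1, 2, 4, 4)           # raw network output, no sigmoid applied

# Soft targets are floats in [0, 1] with the same shape as the logits.
soft_target = torch.zeros(1, 2, 4, 4)
soft_target[0, 0, 1, 1] = 0.3              # "dog with strength 0.3 at (1, 1)"

# Each channel/pixel is treated as an independent binary problem.
criterion = nn.BCEWithLogitsLoss()
loss = criterion(logits, soft_target)
```

Unlike the CrossEntropyLoss case with class-index targets, the target here is a float tensor with one value per channel and pixel, so the channels do not have to sum to one.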

rbidanta · April 7, 2018, 3:36pm

Thanks @smth

Essentially, if I have a single channel in the 2D output prediction, would applying CrossEntropyLoss against the target mask be a good approach?

BrianDo2005 · June 4, 2018, 9:27pm

@rbidanta:

If I understand correctly - BCEWithLogitsLoss() may be more appropriate for your problem.

I assume you did not apply any activation function after your last 1x1 conv layer. In that case, you need to pass the output through a sigmoid layer to convert it to a map with values between 0 and 1 (matching the range of your labels, 0: background and 1: segmentation mask). BCEWithLogitsLoss() is therefore a natural choice for your application, because it applies a sigmoid to the output before computing the binary cross-entropy loss.
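A small sketch of the point above, with made-up shapes and random tensors in place of a real model output and mask. It checks that BCEWithLogitsLoss matches sigmoid followed by BCELoss, and also shows why CrossEntropyLoss is not useful with a single output channel: softmax over one channel is always 1, so the loss is zero whatever the network predicts.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical single-channel output: batch of 2, 1 channel, 4x4 pixels.
logits = torch.randn(2, 1, 4, 4)                     # raw output of the last 1x1 conv
mask = torch.randint(0, 2, (2, 1, 4, 4)).float()     # 0 = background, 1 = segmentation mask

# Fused version: sigmoid is applied inside the loss.
fused = nn.BCEWithLogitsLoss()(logits, mask)

# Two-step version: explicit sigmoid, then plain binary cross entropy.
two_step = nn.BCELoss()(torch.sigmoid(logits), mask)

# With one channel, class 0 is the only valid index and the softmax is
# trivially 1, so CrossEntropyLoss gives (numerically) zero.
only_class = torch.zeros(2, 4, 4, dtype=torch.long)
degenerate = nn.CrossEntropyLoss()(logits, only_class)
```

`fused` and `two_step` should match up to floating-point error; the fused form is generally preferred because it is more numerically stable. For a binary mask with CrossEntropyLoss you would instead need two output channels (background and foreground).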

Hope that helps.