This is a conceptual question on loss functions: I am trying to understand the scenarios where I should use BCEWithLogitsLoss over CrossEntropyLoss. (Apologies if this is too naive a question to ask.)

I am currently working on an image segmentation project where I intend to use the U-Net model. The paper states: "The energy function is computed by a pixel-wise soft-max over the final feature map combined with the cross entropy loss function," and going by the PyTorch documentation it seems this loss is similar to BCEWithLogitsLoss. Any guidance would be really helpful.

If I understand correctly, BCEWithLogitsLoss() may be more appropriate for your problem.

Assuming that you did not apply any activation function at your last conv-1x1 layer, you would need to pass the output through a Sigmoid layer to convert it to a map with values between 0 and 1 (matching the range of your labels: 0 for background, 1 for the segmentation mask). That said, BCEWithLogitsLoss() is a natural choice for your application because it applies a Sigmoid to the raw output internally before computing the binary cross entropy loss, in a numerically stable way.
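A minimal sketch of this (the shapes are just illustrative, assuming a single-channel binary mask): BCEWithLogitsLoss on raw logits matches Sigmoid followed by BCELoss, while CrossEntropyLoss is what you would use for the multi-class, pixel-wise softmax case the U-Net paper describes.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Raw, unactivated output of a hypothetical conv-1x1 layer: (N, 1, H, W)
logits = torch.randn(4, 1, 8, 8)
# Binary target mask: 0 = background, 1 = segmentation mask
target = torch.randint(0, 2, (4, 1, 8, 8)).float()

# Fused version: Sigmoid + BCE in one numerically stable op
loss_fused = nn.BCEWithLogitsLoss()(logits, target)

# Equivalent (but less stable) two-step version
loss_twostep = nn.BCELoss()(torch.sigmoid(logits), target)

assert torch.allclose(loss_fused, loss_twostep, atol=1e-6)

# Multi-class alternative: pixel-wise softmax + cross entropy, as in the paper.
# Logits are (N, K, H, W); targets are integer class indices of shape (N, H, W).
logits_mc = torch.randn(4, 3, 8, 8)
target_mc = torch.randint(0, 3, (4, 8, 8))
loss_mc = nn.CrossEntropyLoss()(logits_mc, target_mc)
```

So for a single foreground class you can keep one output channel and use BCEWithLogitsLoss without any final activation; for more than two classes, keep one channel per class and use CrossEntropyLoss, which applies the softmax internally.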