BCELoss for binary pixel-wise segmentation

Hi there,
I’m implementing a UNet for binary segmentation, using a Sigmoid output layer and BCELoss. The problem is that after several iterations the network predicts very small values per pixel, even in the ground-truth mask regions where it should predict values close to 1. Does this give any intuition about what is going wrong?
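
A minimal sketch of the setup described above, assuming PyTorch and a single-channel output. The tensor shapes and the random tensor standing in for the UNet output are illustrative assumptions, not the actual network. It shows that BCELoss accepts (N, 1, H, W) tensors directly and, with the default reduction, averages the per-pixel binary cross-entropy:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the UNet's raw output: shape (N, 1, H, W).
logits = torch.randn(2, 1, 8, 8)

# Binary ground-truth mask with the same shape, stored as float 0.0 / 1.0.
target = torch.zeros(2, 1, 8, 8)
target[:, :, 2:6, 2:6] = 1.0

# Sigmoid maps the raw scores to per-pixel probabilities in (0, 1).
probs = torch.sigmoid(logits)

# Default reduction="mean" averages the loss over every pixel.
criterion = nn.BCELoss()
loss = criterion(probs, target)

# The same quantity written out by hand, for comparison.
manual = -(target * torch.log(probs)
           + (1 - target) * torch.log(1 - probs)).mean()

print(loss.item(), manual.item())
```

If the target were an integer tensor or had a different shape from the prediction, BCELoss would raise an error rather than silently broadcast, so that is one thing worth ruling out.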

Besides, there is NLLLoss2d, which is meant for pixel-wise losses. Currently, I’m simply ignoring it and I’m using MSELoss() directly. Should I use NLLLoss2d with a Sigmoid activation layer?