Hi,
It seems you have misunderstood BCEWithLogitsLoss. It applies the sigmoid function to its inputs (the raw logits), not to its output. The pipeline is: x -> BCEWithLogitsLoss is equivalent to x -> sigmoid -> BCELoss. (Note that BCELoss is also available as a standalone loss in PyTorch.)
If you look at the documentation of torch.nn.BCEWithLogitsLoss, it says “This loss combines a Sigmoid layer and the BCELoss in one single class. This version is more numerically stable than using a plain Sigmoid followed by a BCELoss as…”.
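To make the equivalence concrete, here is a minimal sketch. The tensor shapes, seed, and variable names are my own assumptions for illustration, not something from your code. It compares the fused loss with the two-step version on ordinary logits, then feeds in one extreme logit to show where the fused version is better behaved.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical raw model outputs (logits) and binary targets.
logits = torch.randn(4, 3)
targets = torch.randint(0, 2, (4, 3)).float()

# Fused version: the sigmoid is applied internally to the inputs.
loss_fused = nn.BCEWithLogitsLoss()(logits, targets)

# Two-step version: apply the sigmoid yourself, then BCELoss.
loss_two_step = nn.BCELoss()(torch.sigmoid(logits), targets)

# For moderate logits the two agree up to floating-point error.
print(torch.allclose(loss_fused, loss_two_step, atol=1e-6))

# With an extreme logit, sigmoid underflows to 0 in float32, so the
# two-step version can no longer recover the true loss, while the
# fused version works directly from the logit.
extreme_logit = torch.tensor([-200.0])
extreme_target = torch.tensor([1.0])
fused_extreme = nn.BCEWithLogitsLoss()(extreme_logit, extreme_target)
two_step_extreme = nn.BCELoss()(torch.sigmoid(extreme_logit), extreme_target)
print(fused_extreme.item(), two_step_extreme.item())
```

So if your model already ends with a sigmoid, use BCELoss; if it outputs raw logits, use BCEWithLogitsLoss and do not add a sigmoid before it.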
Also, this post may help you too.
By the way, sigmoid always maps any real-valued input into the range (0, 1), so its output can be read as a probability.
Best