Using BCEWithLogitsLoss in training and BCELoss for testing

Hi,

I have a binary classification model. I used a sigmoid activation on the output and BCELoss as the loss function. When I normalized my input vectors, the predictions got worse (without normalization it works properly). So I removed the sigmoid from my network and used BCEWithLogitsLoss for the training phase instead, but kept BCELoss for testing. Now I am getting this error in the test phase: 'all elements of input should be between 0 and 1'. I know why it is happening: the predictions at test time are not between 0 and 1. My question is: why are my test predictions outside the 0-1 range? Should I use BCEWithLogitsLoss for testing as well? If so, I do not understand why.
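A minimal sketch of the mismatch I am describing (the tensor values are illustrative, not my real data):

```python
import torch
import torch.nn as nn

logits = torch.tensor([[-2.3], [0.4], [1.7]])  # raw outputs of a model with no sigmoid
target = torch.tensor([[0.0], [1.0], [1.0]])

# Training phase: works, since BCEWithLogitsLoss applies the sigmoid internally.
train_loss = nn.BCEWithLogitsLoss()(logits, target)

# Test phase: raises "all elements of input should be between 0 and 1",
# because BCELoss expects probabilities and -2.3 and 1.7 fall outside [0, 1].
test_loss = nn.BCELoss()(logits, target)
```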

Thanks

Simply put: without the sigmoid activation, your model outputs raw logits, which are unbounded real values and therefore not guaranteed to lie between 0 and 1.

As the name implies, BCEWithLogitsLoss computes the binary cross-entropy directly from raw logits, while BCELoss expects probabilities in the range [0, 1] as input, as mentioned in the docs (BCELoss — PyTorch 2.1 documentation).
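You can verify this relationship directly: applying torch.sigmoid to the logits before BCELoss reproduces what BCEWithLogitsLoss computes internally (the random tensors below are just for illustration):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(8, 1)                    # raw logits, unbounded
target = torch.randint(0, 2, (8, 1)).float()

with_logits = nn.BCEWithLogitsLoss()(logits, target)        # sigmoid fused into the loss
manual = nn.BCELoss()(torch.sigmoid(logits), target)        # sigmoid applied by hand

print(torch.allclose(with_logits, manual))  # True, up to floating-point error
```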

See past discussion here: BCELoss vs BCEWithLogitsLoss

So there are two options (both sketched in code after this list):

  1. model(input) → logits → BCEWithLogitsLoss → loss
  2. model(input) → logits → torch.sigmoid → probabilities → BCELoss → loss
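As a quick sketch of both options (the model and shapes are placeholders, not from the original post):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)                      # placeholder model with no final sigmoid
x = torch.randn(4, 10)
target = torch.randint(0, 2, (4, 1)).float()

logits = model(x)

# Option 1: raw logits go straight into BCEWithLogitsLoss (numerically more stable).
loss1 = nn.BCEWithLogitsLoss()(logits, target)

# Option 2: apply the sigmoid yourself, then use BCELoss on the probabilities.
loss2 = nn.BCELoss()(torch.sigmoid(logits), target)
```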

I would recommend using the same steps during both training and testing to avoid discrepancies.
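For example, with option 1 you would keep BCEWithLogitsLoss in evaluation too, and apply the sigmoid only when you need actual probabilities or class predictions (the helper below is an illustrative sketch, not code from the original post):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)                # illustrative model, no sigmoid inside
criterion = nn.BCEWithLogitsLoss()      # the same criterion for train and test

def evaluate(x, target):
    model.eval()
    with torch.no_grad():
        logits = model(x)
        loss = criterion(logits, target)   # identical loss computation as in training
        probs = torch.sigmoid(logits)      # probabilities, only for predictions
        preds = (probs > 0.5).float()      # thresholded class labels
    return loss, preds
```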