BCELoss vs BCEWithLogitsLoss

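I understand how the two differ in implementation, but I don’t understand the theoretical advantage of using BCE with the sigmoid built in (BCEWithLogitsLoss) versus without it (BCELoss, applied to outputs that have already been passed through a sigmoid).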
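To make the comparison concrete, here is a minimal plain-Python sketch of the two formulations on a single logit. It is not PyTorch's actual code, and the function names are hypothetical, chosen only for illustration. The first path mirrors "sigmoid, then BCE on the probability"; the second works on the raw logit using the algebraically equivalent form max(x, 0) - x*y + log(1 + exp(-|x|)), which is the kind of log-sum-exp rewrite the PyTorch docs credit for BCEWithLogitsLoss being more numerically stable.

```python
import math


def sigmoid(x):
    # Logistic function; for large positive x the result rounds to exactly 1.0.
    return 1.0 / (1.0 + math.exp(-x))


def bce_on_probability(p, y):
    # Textbook binary cross-entropy on a probability p (hypothetical helper).
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def bce_on_logit(x, y):
    # Same loss written directly in terms of the logit x (hypothetical helper).
    # Algebraically equal to bce_on_probability(sigmoid(x), y), but it never
    # evaluates log(0), because the exponent -|x| is always non-positive.
    return max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))


if __name__ == "__main__":
    # Moderate logit: both paths agree up to floating-point rounding.
    print(bce_on_probability(sigmoid(0.5), 1.0), bce_on_logit(0.5, 1.0))

    # Confidently wrong prediction: the sigmoid saturates to 1.0, so the
    # probability path has to evaluate log(1 - 1.0) and fails.
    try:
        print(bce_on_probability(sigmoid(40.0), 0.0))
    except ValueError as err:
        print("probability path failed:", err)

    # The logit path stays finite for the same input.
    print(bce_on_logit(40.0, 0.0))
```

On paper the two are the same function, so any difference is numerical rather than theoretical: once the sigmoid saturates, the information needed to compute the loss (and its gradient) has already been rounded away before the log is taken. As far as I know, PyTorch's BCELoss avoids the outright failure shown here by clamping its log outputs, but the clamped value is then no longer the true loss.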