Hello Alex and Doosti!
For the binary case, BCELoss is a good choice.
Just to clarify something, for a binary-classification problem, you
are best off using the logits that come out of a final Linear layer,
with no threshold or Sigmoid activation, and feeding them into
BCEWithLogitsLoss. (Using Sigmoid and BCELoss is less
numerically stable.)
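To make the stability point concrete, here is a minimal pure-Python sketch of the arithmetic. The function names bce_naive and bce_from_logit are mine, for illustration only, and the rearranged formula is one common stable form of the loss, not necessarily what PyTorch does internally.

```python
import math

def bce_naive(logit, target):
    # Sigmoid first, then the textbook binary cross-entropy formula.
    p = 1.0 / (1.0 + math.exp(-logit))
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))

def bce_from_logit(logit, target):
    # Algebraically equivalent rearrangement that never evaluates log(0):
    #   max(x, 0) - x * y + log(1 + exp(-|x|))
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))

# For a moderate logit the two versions agree.
print(bce_naive(2.0, 1.0), bce_from_logit(2.0, 1.0))

# For a confidently wrong prediction, sigmoid(40) rounds to exactly 1.0,
# so the naive version ends up asking for log(0).
try:
    bce_naive(40.0, 0.0)
except ValueError as err:
    print("naive version failed:", err)

# The logit-based version still returns a finite, meaningful loss.
print(bce_from_logit(40.0, 0.0))
```

As far as I know, PyTorch's BCELoss does not fail outright in this situation because it clamps its log outputs, but once the Sigmoid has saturated, the information in the logit is already lost, which is why working from the logits directly is preferable.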
And, as Doosti recommended, your last layer should have a single
output, rather than 2. Thus:
nn.Linear(num_ftrs, 1)
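Putting the pieces together, a rough sketch of how the single-output layer and BCEWithLogitsLoss might be wired up. The value of num_ftrs, the batch size, the random features, and the labels are all made up for illustration; in your case the features would come from the rest of your model.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

num_ftrs = 512    # assumed feature size, for illustration only
batch_size = 4    # assumed batch size

head = nn.Linear(num_ftrs, 1)         # single output: one logit per sample
criterion = nn.BCEWithLogitsLoss()    # applies the sigmoid internally

# Stand-in for the features produced by the rest of the network.
features = torch.randn(batch_size, num_ftrs)
# Targets must be floating point (0.0 or 1.0) with the same shape as the logits.
labels = torch.tensor([0.0, 1.0, 1.0, 0.0])

logits = head(features).squeeze(1)    # shape [batch_size]; raw scores, no Sigmoid
loss = criterion(logits, labels)
loss.backward()

# For predictions, thresholding the logit at 0 corresponds to
# thresholding the probability sigmoid(logit) at 0.5.
preds = (logits.detach() > 0).long()
probs = torch.sigmoid(logits.detach())   # only needed if you want probabilities
```

Note that the Sigmoid only shows up when you want actual probabilities; the loss itself is always computed from the raw logits.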
Best.
K. Frank