Hello Alex and Doosti!

For the binary case, BCELoss is a good choice.
Just to clarify something: for a binary-classification problem, you
are best off taking the logits that come out of a final Linear layer,
with no threshold or Sigmoid activation, and feeding them into
BCEWithLogitsLoss. (Using Sigmoid and BCELoss is less numerically
stable.)
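To illustrate the stability point, here is a small sketch comparing the two approaches. The tensor values are made up for the demonstration. For moderate logits the two losses agree, but for a very large logit the Sigmoid saturates to exactly 1.0 in float32, and BCELoss then falls back on its internal clamp of the log term instead of returning the true loss.

```python
import torch
import torch.nn as nn

# Made-up logits and float targets, purely for illustration.
logits = torch.tensor([-3.0, 0.5, 4.0])
targets = torch.tensor([0.0, 1.0, 1.0])

# For moderate logits the two formulations give (nearly) the same loss.
loss_with_logits = nn.BCEWithLogitsLoss()(logits, targets)
loss_sigmoid_bce = nn.BCELoss()(torch.sigmoid(logits), targets)
print(loss_with_logits.item(), loss_sigmoid_bce.item())

# A confidently wrong prediction: huge positive logit, target 0.
extreme_logit = torch.tensor([200.0])
negative_target = torch.tensor([0.0])

# BCEWithLogitsLoss works in log-space, so the loss stays close to
# the true value (about 200 here).
stable = nn.BCEWithLogitsLoss()(extreme_logit, negative_target)

# sigmoid(200.0) rounds to 1.0 in float32, so log(1 - p) would be -inf;
# BCELoss clamps that log term, and the reported loss is too small.
saturated = nn.BCELoss()(torch.sigmoid(extreme_logit), negative_target)
print(stable.item(), saturated.item())
```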
And, as Doosti recommended, your last layer should have a single
output, rather than 2. Thus:
nn.Linear(num_ftrs, 1)
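As a rough sketch of how the single-output layer fits together with BCEWithLogitsLoss: the value of num_ftrs, the batch size, and the random features below are assumptions for illustration only, standing in for whatever your backbone actually produces. Note that the targets need to be floats with the same shape as the logits, and that thresholding the logit at 0 is the same as thresholding the Sigmoid probability at 0.5.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

num_ftrs = 512   # assumed feature size; use your backbone's value
batch_size = 4   # assumed, for illustration

head = nn.Linear(num_ftrs, 1)        # single output, no Sigmoid
criterion = nn.BCEWithLogitsLoss()

# Stand-in for the features coming out of the rest of the network.
features = torch.randn(batch_size, num_ftrs)
labels = torch.tensor([0, 1, 1, 0])  # integer class labels

logits = head(features)                 # shape: (batch_size, 1)
targets = labels.float().unsqueeze(1)   # float, same shape as logits
loss = criterion(logits, targets)
loss.backward()

# At inference time, logit > 0 corresponds to probability > 0.5.
with torch.no_grad():
    predictions = (logits > 0).long().squeeze(1)
```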
Best.
K. Frank