Hi Bram!
Two comments:
First, as you’ve seen, BCEWithLogitsLoss requires its target to be a float tensor, not long (or a double tensor, if the input is double). And yes, converting to float (labels.float()) is the correct solution.
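A minimal sketch of the point above (the tensor shapes and values here are just illustrative):

```python
import torch

loss_fn = torch.nn.BCEWithLogitsLoss()
logits = torch.randn(5)               # raw model outputs (logits), float
labels = torch.randint(0, 2, (5,))    # 0/1 class labels, a long tensor

# loss_fn(logits, labels) raises a RuntimeError, because the target is long.
# Converting the target to float fixes it:
loss = loss_fn(logits, labels.float())
```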
Second, as to why: Unlike pytorch’s CrossEntropyLoss, BCEWithLogitsLoss supports labels that are probabilities (sometimes called “soft” labels). Thus, a label could be 0.333. This would indicate that the sample has a 33.3% chance of being in the “1”-class (or “yes”-class) and therefore a 66.7% chance of being in the “0”-class (“no”-class). So this is probably a “no,” but a value of 0.00 would be a “hard” (non-probabilistic, fully-certain) “no.”
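For concreteness, a small sketch of soft labels in action (the particular values are made up), checked against the loss’s definition, -(y log σ(x) + (1 - y) log (1 - σ(x))):

```python
import torch

loss_fn = torch.nn.BCEWithLogitsLoss()
logits = torch.tensor([2.0, -1.0, 0.5])
# a probabilistic "probably no," a hard "no," and a hard "yes"
soft_labels = torch.tensor([0.333, 0.0, 1.0])
loss = loss_fn(logits, soft_labels)   # works, no integer labels needed
```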
(Note that probabilistic labels make perfect sense for cross entropy as well. It’s just that pytorch’s CrossEntropyLoss chooses not to support them (although perhaps it should). Doing so would require CrossEntropyLoss to accept a target with a different shape, namely, with a class dimension.)
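One can compute multi-class cross entropy against such probabilistic targets by hand with log_softmax — a minimal sketch (the shapes here are illustrative, with the class dimension the note describes):

```python
import torch

logits = torch.randn(4, 3)                               # batch of 4, 3 classes
soft_targets = torch.softmax(torch.randn(4, 3), dim=1)   # each row sums to 1

# cross entropy with probabilistic targets:
# -sum_c p_c * log q_c, averaged over the batch
log_probs = torch.log_softmax(logits, dim=1)
loss = -(soft_targets * log_probs).sum(dim=1).mean()
```

With one-hot rows as the target this reduces to ordinary (hard-label) cross entropy.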
Best.
K. Frank