Hi Paulo!
Yes, but you have to read between the lines a little bit.
In short, for the binary (two-class) classification problem, if
you use 0 and 1 as the class labels, those class labels can
be understood as probabilities.
This is at least implicit in the BCEWithLogitsLoss documentation,
where we have the equation (formatted more nicely in the link):
ℓ(x, y) = L = {l_1, …, l_N}ᵀ,
l_n = −w_n [y_n · log σ(x_n) + (1 − y_n) · log(1 − σ(x_n))]
We have two classes. Understanding y_n to be the given,
known probability of one of the two classes, and therefore
1 − y_n to be the given probability of the other, we recognize
−[y_n · log σ(x_n) + (1 − y_n) · log(1 − σ(x_n))]
to be the cross-entropy, taking σ(x_n) to be the predicted
probability for one class (and therefore 1 − σ(x_n) to be the
predicted probability of the other).
(x_n is the logit for that class predicted by your model and
σ(x_n) = sigmoid(x_n) is the predicted probability for that class.)
So this equation is telling us how we must interpret y_n.
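As an illustration, here is a little script (my own sketch, not
from the documentation; the logit and target values are made up)
that checks numerically that BCEWithLogitsLoss computes exactly
this formula:

```python
import torch

x = torch.tensor([1.5])   # x_n: the logit predicted by the model
y = torch.tensor([1.0])   # y_n: the target, here 100% probability of class "1"

# pytorch's built-in loss
loss = torch.nn.BCEWithLogitsLoss()(x, y)

# the formula from the documentation, computed by hand
p = torch.sigmoid(x)      # σ(x_n), the predicted probability of class "1"
manual = -(y * torch.log(p) + (1 - y) * torch.log(1 - p))

print(loss.item(), manual.item())   # the two values agree
```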
In the usual case for labelled training data, y_n is the class
label and is therefore equal to 0 or 1. But this agrees
with understanding y_n as a probability: y_n = 0 means
0% probability of being in class "1", which means 100%
probability of being in class "0". And y_n = 1 means 100%
probability of being in class "1".
To repeat this with slightly different wording:
y_n = Prob(class "1") = 0 ⇒ class "0", and
y_n = Prob(class "1") = 1 ⇒ class "1".
(Later in the documentation, the y_n are referred to as "the
targets t[i]," a change of notation that doesn't help matters.)
It would be much more understandable if the documentation
made clear that this is how the class labels enter into
the loss function and gave a concrete example. But drilling
down into the equation for the loss function does tell us
what the class labels have to mean.
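For what it's worth, here is the kind of concrete example I have
in mind (again my own sketch; the numbers are made up). It shows
that the targets may be hard class labels (0.0 or 1.0), but may
equally well be "soft" probabilities between 0 and 1:

```python
import torch

loss_fn = torch.nn.BCEWithLogitsLoss()

logits = torch.tensor([0.8, -1.2, 2.0])   # one logit per sample

# hard labels: sample 0 is class "1", sample 1 is class "0", sample 2 is class "1"
hard_targets = torch.tensor([1.0, 0.0, 1.0])
print(loss_fn(logits, hard_targets))

# soft labels: y_n is the probability of class "1" for each sample
soft_targets = torch.tensor([0.9, 0.1, 0.75])
print(loss_fn(logits, soft_targets))
```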
I hope that this is what you were looking for and explains
what I was referring to in my earlier post.
Best.
K. Frank