Hi macazinc!
Here’s what I think you’re asking:
You have a multi-class (30 classes) classification problem. You know
that for most of your classes your ground-truth target labels are
correct, but your labels sometimes mix up two of your classes, say 4
and 9. Let's say that a sample labelled 4 is actually a 9 25% of the
time, and that a sample labelled 9 is actually a 4 10% of the time.
You will (most likely) want to use cross-entropy loss, but pytorch only
provides a version that takes integer categorical class labels for its
`target`. In your case, you want what I call soft labels, and will have
to write your own soft-label version of cross-entropy. See this post
for an implementation:
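Here is a minimal sketch of what such a soft-label cross-entropy might
look like (the name `soft_cross_entropy` is just my own choice):

```python
import torch

def soft_cross_entropy(logits, soft_targets):
    """Cross-entropy with probabilistic (soft) targets.

    logits:       [nBatch, nClass] raw (unnormalized) scores from the model
    soft_targets: [nBatch, nClass] rows of class probabilities that sum to 1
    """
    log_probs = torch.nn.functional.log_softmax(logits, dim=1)
    # per-sample cross-entropy, averaged over the batch
    return -(soft_targets * log_probs).sum(dim=1).mean()
```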
Now let me assume that your (sometimes incorrect) `target` labels
are given as integer categorical labels. First use `one_hot()` (followed
by `float()`) to convert your categorical labels into soft labels (that all
happen to be zero or one). Then whenever a sample is labelled 4
(`target[i, 4] == 1.0`), set `target[i, 4] = 0.75` and
`target[i, 9] = 0.25`. Similarly, when `target[i, 9] == 1.0`, set
`target[i, 9] = 0.90` and `target[i, 4] = 0.10`.
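As a concrete sketch of that conversion (the `labels` tensor here is
made-up example data):

```python
import torch
import torch.nn.functional as F

num_classes = 30
labels = torch.tensor([4, 9, 2])    # integer categorical labels

# one-hot soft labels that all happen to be 0.0 or 1.0
target = F.one_hot(labels, num_classes).float()

# compute both masks before modifying target
is4 = target[:, 4] == 1.0
is9 = target[:, 9] == 1.0

# soften the two confusable classes per the assumed error rates
target[is4, 4] = 0.75
target[is4, 9] = 0.25
target[is9, 9] = 0.90
target[is9, 4] = 0.10
```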
You then feed the soft-label `target` you constructed into the
soft-label cross-entropy you implemented yourself.
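Putting the pieces together (with `model` and `inputs` standing in as
placeholders for your own network and data):

```python
logits = model(inputs)    # [nBatch, nClass] raw scores
loss = soft_cross_entropy(logits, target)
loss.backward()
```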
Best.
K. Frank