nn.CrossEntropyLoss() and incongruent train and test labels

I am wondering how to deal with incongruent training and test labels with nn.CrossEntropyLoss.

For an ultra minimal example say we have:

import torch
import torch.nn.functional as F

logits = torch.tensor([-0.3080, -0.2961]).reshape(1, 2)
y = torch.tensor([0])
y_test = torch.tensor([2])
F.cross_entropy(logits, y)
F.cross_entropy(logits, y_test)  # raises IndexError: Target 2 is out of bounds

How can I deal with labels found in the test data that never occurred in the training?

Hi Andrew!

The short answer:

Make sure that the classification model you build has outputs for all
classes in your classification problem.

An important note:

If your test set includes classes (“test labels”) that do not occur in
your training set (“training labels”), it will not be possible for your
model to learn to predict those missing classes correctly.

Some further explanation:

CrossEntropyLoss expects an input (the output of your model)
that has shape [nBatch, nClass], and a target (the labels) that
has shape [nBatch] and whose values are integer class labels in the
range [0, nClass - 1] (inclusive).
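
In code, the shape contract looks like this (a toy illustration with
random values, just to show the expected shapes):

import torch
import torch.nn.functional as F

nBatch, nClass = 4, 3
input = torch.randn(nBatch, nClass)        # model output, shape [nBatch, nClass]
target = torch.randint(nClass, (nBatch,))  # labels in 0 .. nClass - 1, shape [nBatch]
loss = F.cross_entropy(input, target)      # every label is in bounds, so this works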

In your example your logits (the input) has shape [1, 2].
Therefore this example mimics a two-class classification problem,
so your y_test (the test target) values should be 0 or 1. The value
2 is “out of bounds,” and would be the label for the third class in a
three-class (or more) classification problem. (Your y_test has shape
[1], which is correct.)
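
So if you mean this to be a three-class problem, the fix is to give
logits a third column (the value added here is arbitrary, just for
illustration):

import torch
import torch.nn.functional as F

y = torch.tensor([0])
y_test = torch.tensor([2])

logits = torch.tensor([-0.3080, -0.2961, 0.1000]).reshape(1, 3)
F.cross_entropy(logits, y)       # label 0 is in bounds
F.cross_entropy(logits, y_test)  # label 2 is now in bounds as well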

To reiterate, if you are performing an nClass classification problem,
you must build a model that has nClass outputs, and your target
(label) values must run from 0 to nClass - 1. Your model can’t
determine from the data, on the fly, how many classes you have.
That has to be baked into your model (and consistent with the
target values).
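
As a minimal sketch of what "baked into your model" means (the input
size of 10 and the single Linear layer are placeholder choices, not a
recommendation):

import torch

nClass = 3  # fixed when the model is built; must cover all train and test labels
model = torch.nn.Linear(10, nClass)  # final layer produces nClass logits

x = torch.randn(1, 10)
logits = model(x)
print(logits.shape)  # torch.Size([1, 3]); valid labels are 0 .. nClass - 1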

Best.

K. Frank

Ah I see – the question seems silly now. I should have just made a tweak and played around a bit more. Thank you for your time and help!