Cross Entropy Loss delivers wrong classes


When I use torch.argmax(output, dim=1) to get the predicted classes, I get the values 0, 1, 2, whereas I expect 1, 2, 3.

I assume there is an error in my implementation. It is a multi-class problem with 10 input variables and one target (y). The target has 3 classes: 1, 2 and 3.

I would appreciate it if someone could have a look and let me know what I may be doing wrong.

Here’s my code:

import torch
import torch.nn as nn

class NeuralNet(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes):
        super().__init__()
        self.layer1 = nn.Linear(input_size, hidden_size)
        self.layer2 = nn.Linear(hidden_size, hidden_size)
        self.layer3 = nn.Linear(hidden_size, num_classes)
        self.relu = nn.ReLU()  # stateless, so one instance is reused

    def forward(self, x):
        out = self.layer1(x)
        out = self.relu(out)
        out = self.layer2(out)
        out = self.relu(out)
        out = self.layer3(out)  # raw logits; CrossEntropyLoss applies log-softmax itself
        return out

the training

vae = NeuralNet(10,6,3)

#device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

loss_function = nn.CrossEntropyLoss()

optimizer = torch.optim.SGD(vae.parameters(), lr = 0.005)

train_y -= 1
test_y -= 1

def train(epoch):
    train_loss = 0

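    for batch_idx, (data, label) in enumerate(train_loader_X):
        optimizer.zero_grad()  # clear gradients from the previous batch

        out = vae(data)
        loss = loss_function(out, label)

        loss.backward()   # without these two calls the weights are never updated
        optimizer.step()

        train_loss += loss.item()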

        if batch_idx % 500 == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader_X.dataset),
                100. * batch_idx / len(train_loader_X), loss.item() / len(data)))
    print('====> Epoch: {} Average loss: {:.4f}'.format(epoch, train_loss / len(train_loader_X.dataset)))

and the prediction

output = vae(test)
predicted = torch.argmax(output,dim =1)

As a result I obtain this:

tensor([2, 1, 0, …, 1, 1, 0])

As mentioned, I expected the values in the tensor above to be 1, 2, 3.

I would be grateful if anyone could shed some light on this.


This is because indexing in PyTorch is zero-based: torch.argmax(output, dim=1) returns the position of the largest output, not a class label. You simply have to map the predicted indices (0, 1, 2) to your classes (1, 2, 3). If you think about it, the classes you want to predict could be cat, dog and fish, and torch.argmax(output, dim=1) would still give you the numbers 0, 1, 2.
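To illustrate the point, here is a minimal sketch with made-up logits (the numbers and the class_names list are assumptions for the example only). It shows that argmax yields positions, and that turning them into labels is a plain lookup:

```python
import torch

# Hypothetical logits for 4 samples and 3 classes (made-up numbers).
logits = torch.tensor([
    [0.1, 2.0, -1.0],
    [1.5, 0.2, 0.3],
    [-0.5, 0.0, 3.0],
    [0.9, 0.8, 0.7],
])

# argmax returns the position of the largest logit in each row: always 0..C-1.
indices = torch.argmax(logits, dim=1)

# Any labelling is just a lookup from index to label.
class_names = ["cat", "dog", "fish"]
predicted_names = [class_names[i] for i in indices.tolist()]

print(indices.tolist())   # [1, 0, 2, 0]
print(predicted_names)    # ['dog', 'cat', 'fish', 'cat']
```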

Let me know if it is not clear!

@beaupreda, I understand your point and thank you for the explanation.

I would appreciate it if you could guide me on how to map the predicted indices (0, 1, 2) to the expected classes (1, 2, 3).


I forgot about that, but if you look at the CrossEntropyLoss docs, it expects the target tensor to be in the range [0, C - 1], where C is the number of classes you want to predict (3 in your case). So in your data preprocessing you have to map your classes (1, 2, 3) into the range (0, 1, 2). In your case, you could simply subtract 1 from each class so the target tensor is in the correct range. Afterwards, simply add 1 to the predicted classes to get back to the (1, 2, 3) classes.
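A minimal sketch of that round trip, assuming the raw labels are the integers 1, 2, 3 (the raw_labels values are made up, and random logits stand in for the model output):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

num_classes = 3
# Hypothetical raw labels as they appear in the dataset: 1, 2, 3.
raw_labels = torch.tensor([1, 3, 2, 1, 3])

# Shift once into the range [0, C - 1] that CrossEntropyLoss expects.
targets = raw_labels - 1
assert targets.min() >= 0 and targets.max() <= num_classes - 1

# Random logits stand in for the model output (batch of 5, 3 classes).
logits = torch.randn(len(targets), num_classes)
loss = nn.CrossEntropyLoss()(logits, targets)

# Shift the predicted indices back to the original 1..3 labels.
predicted_classes = torch.argmax(logits, dim=1) + 1
```

The shift should happen in exactly one place. If the labels are already shifted during preprocessing, shifting them again in the training loop pushes class 1 down to -1.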


@beaupreda, thanks for your feedback.

Following your feedback, I changed the values of Y (train and test) before these variables are converted into tensors:

data.damage_grade.replace([1,2,3],[0,1,2], inplace=True)

I also tried the following code, based on this post (How to represent class_to_idx map for custom dataset in Pytorch), but I do not know where to hook it in.

idx_to_class = {
    '1': "0",
    '2': "1",
    '3': "2",
}
sample_class = idx_to_class[label+1]

Unfortunately, with either option, when I run the whole code the model does not train and returns the following error:

RuntimeError: Assertion `cur_target >= 0 && cur_target < n_classes' failed. at /pytorch/aten/src/THNN/generic/ClassNLLCriterion.c:97

I’d appreciate it if you could let me know what your thoughts are.
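The assertion text says that some target value is below 0 or not below the number of classes. One way to narrow this down is to check the label range right before the loss is computed. The helper below is a hypothetical sketch (check_targets is not a PyTorch function), and the double-shift scenario in the comment is only a guess at one possible cause:

```python
import torch

def check_targets(labels: torch.Tensor, num_classes: int) -> None:
    """Raise a readable error if any label falls outside [0, num_classes - 1]."""
    lo, hi = int(labels.min()), int(labels.max())
    if lo < 0 or hi >= num_classes:
        raise ValueError(
            f"labels span [{lo}, {hi}], expected [0, {num_classes - 1}]"
        )

check_targets(torch.tensor([0, 1, 2]), num_classes=3)  # passes silently

# Labels shifted twice (for example by both replace() and `train_y -= 1`)
# would end up starting at -1 and trip the check.
try:
    check_targets(torch.tensor([-1, 0, 1]), num_classes=3)
except ValueError as err:
    print(err)
```

Calling such a check on each batch's labels inside the training loop gives the offending range directly instead of the low-level assertion.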

Could you post a small example of your dataset so I could try to execute your code?


In my opinion, you do not need to create a mapping from index to label. Since your label is a tensor, simply change your code to

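# training: shift the labels from 1..3 to 0..2 before computing the loss
for batch_idx, (data, label) in enumerate(train_loader_X):
    out = vae(data)
    loss = loss_function(out, label - 1)

# prediction: shift the indices 0..2 back to the classes 1..3
output = vae(test)
predicted = torch.argmax(output, dim=1) + 1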
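For completeness, a self-contained sketch of the same idea on synthetic data. The random features and labels, the batch size and the nn.Sequential stand-in model are assumptions for illustration, not the original dataset or network:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic stand-in data: 60 samples, 10 features, labels in {1, 2, 3}.
features = torch.randn(60, 10)
labels = torch.randint(1, 4, (60,))
loader = DataLoader(TensorDataset(features, labels), batch_size=20)

model = nn.Sequential(nn.Linear(10, 6), nn.ReLU(), nn.Linear(6, 3))
loss_function = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.005)

for data, label in loader:
    optimizer.zero_grad()
    out = model(data)
    loss = loss_function(out, label - 1)  # labels 1..3 -> targets 0..2
    loss.backward()
    optimizer.step()

with torch.no_grad():
    predicted = torch.argmax(model(features), dim=1) + 1  # back to 1..3
```

Note that this only works if the labels reaching the loop are still 1, 2, 3. If they were already shifted earlier (for example with train_y -= 1), the extra "- 1" in the loop should be dropped.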

@beaupreda, thanks for your help. Sorry, I have no time to prepare an example. Your suggestion to remap helped me understand what the issue was and fix it by remapping.

@Eta_C many thanks for your valuable feedback.

Your proposed solution works.