Cross Entropy Loss delivers wrong classes

titoniubo · January 16, 2020, 1:49pm

Hello,

When using torch.argmax(output, dim=1) to see the predicted classes, I get the values 0, 1, 2, whereas the expected ones are 1, 2, 3.

I assume there may be an error in my code. It’s a multi-class prediction, with an input of 10 variables to predict a target (y). The target has 3 classes: 1, 2 and 3.

I would appreciate it if someone could have a look and let me know what I may be doing wrong.

Here’s my code:

import torch
import torch.nn as nn

class NeuralNet(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes):
        super(NeuralNet, self).__init__()
        self.layer1 = nn.Linear(input_size, hidden_size)
        self.layer2 = nn.Linear(hidden_size, hidden_size)
        self.layer3 = nn.Linear(hidden_size, num_classes)
        # ReLU has no parameters, so one instance can be reused.
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.layer1(x)
        out = self.relu(out)
        out = self.layer2(out)
        out = self.relu(out)
        # Return raw logits; nn.CrossEntropyLoss applies log-softmax itself.
        out = self.layer3(out)
        return out

the training

vae = NeuralNet(10,6,3)

#device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

#vae.to(device)

loss_function = nn.CrossEntropyLoss()

optimizer = torch.optim.SGD(vae.parameters(), lr = 0.005)

train_y -= 1
test_y -= 1

def train(epoch):
    vae.train()
    train_loss = 0

    for batch_idx, (data,label) in enumerate(train_loader_X):
        #data= data.to(device)
        #label= label.to(device)

        optimizer.zero_grad()

        out = vae(data)
        loss = loss_function(out, label)

        loss.backward()
        train_loss += loss.item()
        optimizer.step()

        if batch_idx % 500 == 0:
            # CrossEntropyLoss already averages over the batch by default.
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader_X.dataset),
                100. * batch_idx / len(train_loader_X), loss.item()))
    # Average the per-batch mean losses over the number of batches.
    print('====> Epoch: {} Average loss: {:.4f}'.format(epoch, train_loss / len(train_loader_X)))

and the prediction

output = vae(test)
predicted = torch.argmax(output,dim =1)
predicted

As a result I obtain this:

tensor([2, 1, 0, ..., 1, 1, 0])

As mentioned, the expected values in the tensor above are 1, 2, 3.

I would be grateful if anyone could shed some light on this.

beaupreda · January 16, 2020, 4:20pm

Hello,

This is because indexing in PyTorch is zero-based. torch.argmax(output, dim=1) returns the position of the largest output for each sample, not a class label, so you simply have to map the predicted indices (0, 1, 2) to your classes (1, 2, 3). If you think about it, the classes you want to predict could be cat, dog and fish, and torch.argmax(output, dim=1) would still give you the numbers 0, 1, 2.
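To make the zero-based behaviour concrete, here is a minimal sketch with made-up logits. The `logits` values and the `class_names` list are illustrative and not taken from the original post.

```python
import torch

# Hypothetical logits for 4 samples and 3 classes (values chosen for illustration).
logits = torch.tensor([
    [0.1, 0.2, 2.5],
    [0.3, 1.9, 0.4],
    [3.0, 0.1, 0.2],
    [0.2, 2.2, 0.1],
])

# argmax returns the position of the largest value in each row: always 0..C-1.
indices = torch.argmax(logits, dim=1)
print(indices)  # tensor([2, 1, 0, 1])

# The indices carry no meaning on their own; a lookup gives them one.
class_names = ["cat", "dog", "fish"]  # illustrative labels
print([class_names[i] for i in indices.tolist()])  # ['fish', 'dog', 'cat', 'dog']

# For integer classes 1, 2, 3 the lookup reduces to adding 1.
print(indices + 1)  # tensor([3, 2, 1, 2])
```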

Let me know if it is not clear!

titoniubo · January 16, 2020, 4:26pm

@beaupreda, I understand your point and thank you for the explanation.

I would appreciate it if you could kindly guide me on how to map the predicted indices (0, 1, 2) to the expected classes (1, 2, 3).

Sincerely

beaupreda · January 16, 2020, 4:32pm

I forgot about that, but if you look at the CrossEntropyLoss docs, it expects the target tensor to be in the range [0, C - 1], where C is the number of classes you want to predict (3 in your case). So in your data preprocessing you have to map your classes (1, 2, 3) so that they fit in the range (0, 1, 2). In your case, you could simply subtract 1 from each class so the target tensor is in the correct range. Afterwards, simply add 1 to the predicted classes to get back to the (1, 2, 3) classes.
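The round trip described here could look roughly like the sketch below. It assumes integer labels 1, 2, 3 and uses random logits as a stand-in for the model output; `raw_labels`, `targets` and `predicted_classes` are hypothetical names.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

num_classes = 3
# Hypothetical raw labels as they appear in the dataset: 1, 2, 3.
raw_labels = torch.tensor([1, 3, 2, 2, 1])

# Preprocessing: shift into the range [0, C - 1] expected by CrossEntropyLoss.
targets = raw_labels - 1
assert targets.min() >= 0 and targets.max() < num_classes

# Stand-in for model output: random logits of shape (batch, num_classes).
logits = torch.randn(len(targets), num_classes)

loss = nn.CrossEntropyLoss()(logits, targets)
print(loss.item())

# Postprocessing: shift predictions back to the original 1, 2, 3 labels.
predicted_classes = torch.argmax(logits, dim=1) + 1
print(predicted_classes)
```

The shift should be applied exactly once. Doing it both in preprocessing and inside the training loop would produce a target of -1 for class 1.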

titoniubo · January 17, 2020, 9:41am

@beaupreda, thanks for your feedback.

Following your feedback, I changed the values of Y (train and test) before these variables are converted into tensors:

data.damage_grade.replace([1,2,3],[0,1,2], inplace=True)

I also tried the following code, based on this post (How to represent class_to_idx map for custom dataset in Pytorch), but I guess I do not know where to hook it in.

idx_to_class = {
    '1': "0",
    '2': "1",
    '3': "2",
    }
sample_class = idx_to_class[label+1]
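A side note on this kind of lookup: the key type has to match the label type, and the direction matters, since training needs class-to-index while prediction needs index-to-class. A minimal sketch, assuming the labels are plain Python integers (the names `class_to_idx` and `raw_labels` are illustrative):

```python
# Map dataset classes to the zero-based indices used for training.
class_to_idx = {1: 0, 2: 1, 3: 2}
# Invert the mapping to translate predictions back to dataset classes.
idx_to_class = {idx: cls for cls, idx in class_to_idx.items()}

raw_labels = [3, 1, 2, 2]  # hypothetical labels
targets = [class_to_idx[label] for label in raw_labels]
print(targets)  # [2, 0, 1, 1]

restored = [idx_to_class[idx] for idx in targets]
print(restored)  # [3, 1, 2, 2]
```

With string keys such as '1', indexing by an integer label raises a KeyError. A tensor label would first need to be converted with `label.item()`.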

Unfortunately though, with either option, when I run the whole code the model would not train and return the following error:

RuntimeError: Assertion `cur_target >= 0 && cur_target < n_classes' failed. at /pytorch/aten/src/THNN/generic/ClassNLLCriterion.c:97
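This assertion generally means that at least one target value fell outside `[0, n_classes)`. A quick check on the label tensor before training can show which values are actually present. The sketch below uses a made-up `labels` tensor in place of the real training labels, so treat it as a diagnostic idea and not a diagnosis of this particular run.

```python
import torch

num_classes = 3
# Hypothetical label tensor; here the 1..3 labels were never shifted.
labels = torch.tensor([1, 2, 3, 3, 1, 2])

print(torch.unique(labels))  # tensor([1, 2, 3])

# True only when every label is a valid target for CrossEntropyLoss.
in_range = bool(((labels >= 0) & (labels < num_classes)).all())
print(in_range)  # False: the value 3 is out of range for 3 classes

shifted = labels - 1
shifted_in_range = bool(((shifted >= 0) & (shifted < num_classes)).all())
print(shifted_in_range)  # True
```

The same check also catches the opposite mistake: if the shift is applied twice, the smallest label becomes -1 and the assertion fails as well.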

I’d appreciate if you could let me know what your thoughts are.

beaupreda · January 17, 2020, 2:49pm

Could you post a small example of your dataset so I could try to execute your code?

Eta_C · January 18, 2020, 1:47am

Hi

In my opinion, you do not need to create a mapping from index to label. Since your label is a tensor, just simply change your code to

for batch_idx, (data, label) in enumerate(train_loader_X):
    ...
    loss = loss_function(out, label - 1)
    ...

and

output = vae(test)
predicted = torch.argmax(output, dim=1) + 1

titoniubo · January 21, 2020, 10:09am

@beaupreda, thanks for your help. Sorry, I have no time to prepare this. Your suggestion to remap helped me understand what the issue was and fix it.

titoniubo · January 21, 2020, 10:10am

@Eta_C many thanks for your valuable feedback.

Your proposed solution works.