Issue with CrossEntropyLoss()

imharjyotbagga · July 5, 2021, 3:54pm

I want to use CrossEntropyLoss() with a custom dataset for an experiment, but I cannot get the loss computation to run. My code is as follows:

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

class Net(nn.Module):
    def __init__(self) -> None:
        super(Net, self).__init__()
        self.layer1 = nn.Linear(2, 10)
        self.layer2 = nn.Linear(10, 1)

    def forward(self, x):
        x = F.relu(self.layer1(x))
        x = (self.layer2(x))
        return x
    
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = Net().to(device=device)
loss_fn = nn.CrossEntropyLoss()
learning_rate = 1e-3
epochs = 20
optimizer = optim.Adam(model.parameters(), lr=learning_rate)
inputs = torch.tensor([
    [0.,0.],
    [0.,1.],
    [1.,0.],
    [1.,1.]
]).to(device=device)

targets = torch.tensor([
    [0],
    [1],
    [1],
    [0]
], dtype=torch.float32).to(device=device)

model.train()
for epoch in range(epochs):

    pred_output = model(inputs)
    print(pred_output)
    print(targets.dtype)
    print(pred_output.dtype)
    loss = loss_fn(targets, pred_output)
    print(loss)
    
    # optimizer.zero_grad()
    # loss.backward()
    # optimizer.step()
    print()
    break

With this code snippet, I get the following error:

tensor([[0.1445],
        [0.3038],
        [0.1030],
        [0.2709]], device='cuda:0', grad_fn=<AddmmBackward>)
torch.float32
torch.float32
Traceback (most recent call last):
  File ".\main.py", line 58, in <module>
    loss = loss_fn(targets, pred_output)
  File "C:\Users\user\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "C:\Users\user\anaconda3\lib\site-packages\torch\nn\modules\loss.py", line 1047, in forward
    return F.cross_entropy(input, target, weight=self.weight,
  File "C:\Users\user\anaconda3\lib\site-packages\torch\nn\functional.py", line 2693, in cross_entropy
    return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
  File "C:\Users\user\anaconda3\lib\site-packages\torch\nn\functional.py", line 2388, in nll_loss
    ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
RuntimeError: Expected object of scalar type Long but got scalar type Float for argument #2 'target' in call to _thnn_nll_loss_forward

When I change the dtype of targets to torch.long with:

targets = torch.tensor([
    [0],
    [1],
    [1],
    [0]
], dtype=torch.long).to(device=device)

I get the following error:

tensor([[0.4545],
        [0.3657],
        [0.3480],
        [0.2857]], device='cuda:0', grad_fn=<AddmmBackward>)
torch.int64
torch.float32
Traceback (most recent call last):
  File ".\main.py", line 57, in <module>
    loss = loss_fn(targets, pred_output)
  File "C:\Users\user\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "C:\Users\user\anaconda3\lib\site-packages\torch\nn\modules\loss.py", line 1047, in forward
    return F.cross_entropy(input, target, weight=self.weight,
  File "C:\Users\user\anaconda3\lib\site-packages\torch\nn\functional.py", line 2693, in cross_entropy
    return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
  File "C:\Users\user\anaconda3\lib\site-packages\torch\nn\functional.py", line 1672, in log_softmax
    ret = input.log_softmax(dim)
RuntimeError: "host_softmax" not implemented for 'Long'

What am I doing wrong? And what should be done?

pascal_notsawo · July 5, 2021, 7:33pm

The first thing to note is that you are passing the arguments to the loss function in the wrong order (see the CrossEntropyLoss page in the PyTorch 1.9.0 documentation):

Its first argument, input, must be the raw logits output by your model, of shape (N, C), where N is the batch size and C is the number of classes.
Its second argument, target, must be of shape (N), and its elements must be integer class indices (dtype torch.long) in the range [0, C-1].
A usage example can be found in the documentation.
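
As a minimal sketch of those shape and dtype requirements (the sizes N = 4 and C = 3 and the random logits are placeholders for illustration, not values from your code), the loss can also be checked against a hand computation:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

N, C = 4, 3                                # hypothetical batch size and class count
logits = torch.randn(N, C)                 # input: raw scores, shape (N, C), float32
target = torch.tensor([0, 2, 1, 2])        # target: class indices, shape (N,), int64

loss_fn = nn.CrossEntropyLoss()
loss = loss_fn(logits, target)             # argument order is (input, target)

# Same value computed by hand: mean negative log-probability of the true class.
log_probs = torch.log_softmax(logits, dim=1)
manual = -log_probs[torch.arange(N), target].mean()

print(loss.shape, target.dtype)
print(torch.allclose(loss, manual))
```

The loss comes back as a zero-dimensional tensor because the default reduction averages over the batch.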

That said, the loss function should be called as follows:

loss = loss_fn(pred_output, targets)

Second, looking at your model and data:

It returns an output of shape (N, 1), i.e. C = 1. CrossEntropyLoss is pointless in that case: the only valid class index is 0, so the 1s in your target will raise an error like IndexError: Target 1 is out of bounds, because the elements of target must lie in [0, C-1] = [0, 0].
Also, your target has shape (N, 1), but the loss function used here expects a target of shape (N).
Finally, your target has dtype float32; it must be torch.long.
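
One way to address all three points at once, sketched under the assumption that you want two classes (0 and 1) for your XOR-style data: give the last layer two outputs and pass the targets as a flat long tensor. The num_classes parameter and the losses list are names added here for illustration, and the commented-out training steps from your loop are enabled.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

torch.manual_seed(0)

class Net(nn.Module):
    def __init__(self, num_classes: int = 2) -> None:  # num_classes is an assumption
        super().__init__()
        self.layer1 = nn.Linear(2, 10)
        self.layer2 = nn.Linear(10, num_classes)        # one logit per class

    def forward(self, x):
        x = F.relu(self.layer1(x))
        return self.layer2(x)                           # raw logits, no softmax

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = Net().to(device)
loss_fn = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=1e-3)

inputs = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]], device=device)
# Class indices: shape (N,), dtype long
targets = torch.tensor([0, 1, 1, 0], dtype=torch.long, device=device)

model.train()
losses = []
for epoch in range(20):
    pred_output = model(inputs)             # shape (4, 2)
    loss = loss_fn(pred_output, targets)    # (input, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(losses[0], losses[-1])
```

No softmax is applied in forward because CrossEntropyLoss applies log_softmax internally, as the traceback in your post shows.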

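Here is code that runs with your current model. Since C = 1, the targets can only contain 0 and the loss is always 0, so this only demonstrates the correct call. If your model outputs C values per sample, targets may contain any class index in [0, C-1].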

# ...
targets = torch.tensor([
    [0],
    [0],
    [0],
    [0]
], dtype=torch.float32).to(device=device).squeeze(1).long()  # squeeze(1): (N, 1) -> (N,); long(): float32 -> int64
# ...
for epoch in range(epochs):
    pred_output = model(inputs) 
    loss = loss_fn(pred_output, targets) 
    print(loss)
# ...
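
If you would rather keep a single output and float targets of shape (N, 1), a commonly used alternative for two-class problems is nn.BCEWithLogitsLoss. This is only a sketch of that option and an assumption about what fits your experiment; the nn.Sequential model below is a compact stand-in for your Net class.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for Net: 2 inputs -> 10 hidden units -> 1 logit
model = nn.Sequential(nn.Linear(2, 10), nn.ReLU(), nn.Linear(10, 1))
loss_fn = nn.BCEWithLogitsLoss()            # applies the sigmoid internally

inputs = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
# Float targets with the same shape as the model output, (N, 1)
targets = torch.tensor([[0.], [1.], [1.], [0.]])

pred_output = model(inputs)                 # shape (4, 1)
loss = loss_fn(pred_output, targets)        # (input, target), both float32
print(loss)
```

Here the target must be floating point and match the shape of the output, which is the opposite of what CrossEntropyLoss expects.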