# NLLLoss vs CrossEntropyLoss

I’m comparing the results of NLLLoss and CrossEntropyLoss and I don’t understand why the loss for NLLLoss is negative compared to CrossEntropyLoss with the same inputs.

```python
import torch
import torch.nn as nn

label = torch.tensor([3, 0, 1, 1, 4])
output = torch.tensor([[0.5073, 0.4838, 0.5053, 0.4839, 0.5183],
                       [0.5072, 0.4849, 0.4933, 0.4809, 0.5148],
                       [0.5020, 0.4836, 0.5021, 0.4829, 0.5162],
                       [0.5023, 0.4801, 0.4994, 0.4805, 0.5174],
                       [0.5024, 0.4899, 0.4932, 0.4835, 0.5148]])

criterion = nn.NLLLoss()
loss = criterion(output, label)
loss  # tensor(-0.4939)

criterion = nn.CrossEntropyLoss()
loss = criterion(output, label)
loss  # tensor(1.6128)
```
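Incidentally, the negative value is easy to reproduce by hand: with default reduction, `NLLLoss` just picks out the target entry of each row and negates the mean, so feeding it raw (non-log) activations gives minus the mean of the selected entries. A plain-Python check on the same numbers:

```python
# Same values as the tensors above, as plain Python lists.
output = [[0.5073, 0.4838, 0.5053, 0.4839, 0.5183],
          [0.5072, 0.4849, 0.4933, 0.4809, 0.5148],
          [0.5020, 0.4836, 0.5021, 0.4829, 0.5162],
          [0.5023, 0.4801, 0.4994, 0.4805, 0.5174],
          [0.5024, 0.4899, 0.4932, 0.4835, 0.5148]]
label = [3, 0, 1, 1, 4]

# NLLLoss(x, t) with reduction="mean" computes -mean(x[i][t[i]])
picked = [row[t] for row, t in zip(output, label)]
loss = -sum(picked) / len(picked)
print(round(loss, 4))  # -0.4939
```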

`CrossEntropyLoss` applies `LogSoftmax` to the output before passing it to `NLLLoss`. `NLLLoss` therefore expects log-probabilities as its input; fed raw activations, it returns a meaningless (here negative) value.
This snippet shows how to get equal results:

```python
nll_loss = nn.NLLLoss()
log_softmax = nn.LogSoftmax(dim=1)
print(nll_loss(log_softmax(output), label))  # tensor(1.6128)

cross_entropy_loss = nn.CrossEntropyLoss()
print(cross_entropy_loss(output, label))  # tensor(1.6128)
```
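The 1.6128 itself can also be double-checked without PyTorch: per row, cross entropy is `log(sum_j exp(x_j)) - x[target]`, averaged over rows. A plain-Python sanity check on the same numbers:

```python
import math

output = [[0.5073, 0.4838, 0.5053, 0.4839, 0.5183],
          [0.5072, 0.4849, 0.4933, 0.4809, 0.5148],
          [0.5020, 0.4836, 0.5021, 0.4829, 0.5162],
          [0.5023, 0.4801, 0.4994, 0.4805, 0.5174],
          [0.5024, 0.4899, 0.4932, 0.4835, 0.5148]]
label = [3, 0, 1, 1, 4]

# per-row cross entropy: logsumexp(x) - x[target]
losses = [math.log(sum(math.exp(x) for x in row)) - row[t]
          for row, t in zip(output, label)]
loss = sum(losses) / len(losses)
print(loss)  # ≈ 1.6128
```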

Hello,
Is there any difference in running time or accuracy between using CrossEntropyLoss and LogSoftmax + NLLLoss (on CPU or GPU)?
Which option is considered more conventional / recommended?
Thanks

`nn.CrossEntropyLoss` uses `F.log_softmax` and `F.nll_loss` internally, so there wouldn’t be a difference in using the latter ops explicitly.
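The same equivalence holds for the functional API; a quick check (sketch, assuming random scores stand in for real model outputs):

```python
import torch
import torch.nn.functional as F

label = torch.tensor([3, 0, 1, 1, 4])
output = torch.randn(5, 5)  # any raw scores work

# F.cross_entropy fuses the two ops below into a single call
a = F.cross_entropy(output, label)
b = F.nll_loss(F.log_softmax(output, dim=1), label)
assert torch.allclose(a, b)
```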