About CrossEntropyLoss and NLLLoss

As far as I know, nn.CrossEntropyLoss is equivalent to nn.LogSoftmax + nn.NLLLoss,
but nn.CrossEntropyLoss seems to do something with one-hot targets internally.
So if I want to replace nn.CrossEntropyLoss with nn.LogSoftmax + nn.NLLLoss,
do I need to write a one-hot function myself?


Just try it.

import torch
import torch.nn as nn

ce = nn.CrossEntropyLoss()
ls = nn.LogSoftmax(dim=-1)
nll = nn.NLLLoss()

batch_size = 5
num_classes = 8
x = torch.rand(batch_size, num_classes)        # raw logits
y = torch.randint(num_classes, (batch_size,))  # integer class indices, no one-hot needed

# Both lines print the same value: CrossEntropyLoss == LogSoftmax + NLLLoss
print(ce(x, y))
print(nll(ls(x), y))

PS: Here is the one-hot API if you need it: nn.functional.one_hot.
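
If you do want to see what the one-hot route looks like, here is a minimal sketch (the sizes are made up for illustration) showing that a manual one-hot formulation gives the same value as nn.CrossEntropyLoss fed with plain class indices, so no custom one-hot function is required:

import torch
import torch.nn as nn
import torch.nn.functional as F

batch_size, num_classes = 5, 8
x = torch.rand(batch_size, num_classes)        # raw logits
y = torch.randint(num_classes, (batch_size,))  # integer class indices

# Built-in loss: takes class indices directly.
ce = nn.CrossEntropyLoss()
print(ce(x, y))

# Manual one-hot formulation: -(one_hot * log_softmax) summed over classes, averaged over the batch.
y_onehot = F.one_hot(y, num_classes=num_classes).float()
log_probs = F.log_softmax(x, dim=-1)
print((-(y_onehot * log_probs).sum(dim=-1)).mean())  # same value as above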

Thank you for your help.
The result of print(ce(x, y)) is indeed equal to
print(nll(ls(x), y)),
but I ran into a new problem:
print(ce(ls(x), y)) also gives the same answer.
It seems like the LogSoftmax has no effect?

import torch
import torch.nn.functional as F

x = torch.rand(1, 2, 3, 4)
ls = F.log_softmax(x, dim=-1)
lsls = F.log_softmax(F.log_softmax(x, dim=-1), dim=-1)
print((lsls - ls).abs().max())  # ~0: applying log_softmax twice changes (almost) nothing

In my forward network
the last layer is x = nn.Linear(xxxx, classnums)
and I compute loss = CrossEntropyLoss(x, target).

If after x = nn.Linear(xxxx, classnums)
I add x = F.log_softmax(x), then

loss1 = nn.CrossEntropyLoss()(x, target)
loss2 = nn.NLLLoss()(x, target)
seem to give loss1 == loss2?
That's strange…
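
A minimal sketch of what I mean (the layer sizes, batch size, and classnums here are made up just for illustration):

import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
classnums = 4
linear = nn.Linear(16, classnums)        # stand-in for the real last layer
inp = torch.rand(5, 16)
target = torch.randint(classnums, (5,))

x = F.log_softmax(linear(inp), dim=-1)   # log_softmax applied after the last layer

loss1 = nn.CrossEntropyLoss()(x, target)
loss2 = nn.NLLLoss()(x, target)
print(loss1, loss2)                      # the two values match (up to floating point)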

Since LogSoftmax is idempotent, you’ll get the same output as shown by @Eta_C’s example.
Internally nn.CrossEntropyLoss will apply another F.log_softmax on the inputs.

However, I would recommend sticking to:

  • nn.LogSoftmax + nn.NLLLoss or
  • raw logits + nn.CrossEntropyLoss

as applying LogSoftmax twice won't give you any benefit with these loss functions.
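
To make the two recommended pairings concrete, here is a minimal sketch (the toy model and data are made up for illustration) showing that they produce identical losses:

import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(16, 8)                 # toy model producing raw logits
inp = torch.rand(4, 16)
target = torch.randint(8, (4,))

# Option 1: raw logits + nn.CrossEntropyLoss
logits = model(inp)
loss_ce = nn.CrossEntropyLoss()(logits, target)

# Option 2: nn.LogSoftmax + nn.NLLLoss
log_probs = nn.LogSoftmax(dim=-1)(model(inp))
loss_nll = nn.NLLLoss()(log_probs, target)

print(loss_ce, loss_nll)                 # identical values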


Although LogSoftmax is idempotent, applying it twice will still introduce floating point precision errors. :rofl: