As far as I know, nn.CrossEntropyLoss is equal to nn.LogSoftmax + nn.NLLLoss,
but nn.CrossEntropyLoss seems to have some one-hot handling built in.
So if I want to replace nn.CrossEntropyLoss with nn.LogSoftmax + nn.NLLLoss,
do I need to write a one-hot encoding function myself?
Just try it.
import torch
import torch.nn as nn

ce = nn.CrossEntropyLoss()
ls = nn.LogSoftmax(dim=-1)
nll = nn.NLLLoss()

batch_size = 5
num_classes = 8
x = torch.rand(batch_size, num_classes)        # raw logits
y = torch.randint(num_classes, (batch_size,))  # class indices, no one-hot needed

print(ce(x, y))       # CrossEntropyLoss on raw logits
print(nll(ls(x), y))  # LogSoftmax + NLLLoss gives the same value
PS: There is also a one_hot API: nn.functional.one_hot.
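For reference, a minimal usage sketch of F.one_hot (note that both nn.NLLLoss and nn.CrossEntropyLoss expect class indices, not one-hot targets, so you usually don't need it here):

import torch
import torch.nn.functional as F

y = torch.randint(8, (5,))              # class indices, shape (batch_size,)
y_onehot = F.one_hot(y, num_classes=8)  # shape (batch_size, num_classes)
print(y_onehot)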
Thank you for your help!
The result of print(ce(x, y)) is equal to the result of print(nll(ls(x), y)).
But now I have a new problem:
print(ce(ls(x), y)) also gives the same answer.
It seems like the LogSoftmax has no effect?
import torch
import torch.nn.functional as F

x = torch.rand(1, 2, 3, 4)
ls = F.log_softmax(x, dim=1)
lsls = F.log_softmax(F.log_softmax(x, dim=1), dim=1)  # apply log_softmax twice
print((lsls - ls).abs().max())  # only a tiny floating point difference
In my forward network, the last layer is x = nn.Linear(xxxx, classnums), and I use loss = nn.CrossEntropyLoss()(x, target).
If after x = nn.Linear(xxxx, classnums) I add x = F.log_softmax(x), then with
loss1 = nn.CrossEntropyLoss()(x, target)
loss2 = nn.NLLLoss()(x, target)
it seems loss1 == loss2?
That's strange…
Since LogSoftmax is idempotent, you'll get the same output, as shown by @Eta_C's example. Internally nn.CrossEntropyLoss will apply another F.log_softmax on the inputs.
However, I would recommend sticking to either:
- nn.LogSoftmax + nn.NLLLoss, or
- raw logits + nn.CrossEntropyLoss,
as you won't get any benefit from applying LogSoftmax before these loss functions.
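To make this concrete, here is a minimal sketch (same shapes as the example above) showing that the extra LogSoftmax is numerically a no-op, since nn.CrossEntropyLoss applies log_softmax internally and log_softmax is idempotent:

import torch
import torch.nn as nn
import torch.nn.functional as F

ce = nn.CrossEntropyLoss()
nll = nn.NLLLoss()

x = torch.rand(5, 8)        # raw logits
y = torch.randint(8, (5,))  # class indices
log_probs = F.log_softmax(x, dim=1)

print(ce(x, y))           # intended usage: raw logits + CrossEntropyLoss
print(nll(log_probs, y))  # intended usage: log-probabilities + NLLLoss
print(ce(log_probs, y))   # redundant LogSoftmax: same value up to float precision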
Although LogSoftmax is idempotent, applying it twice will still introduce a small floating point precision error.