# Details of torch.nn.CrossEntropyLoss

Hi,
For a problem I have used CrossEntropyLoss as the criterion to evaluate the performance of a neural network. To learn the details of this function, I visited this page: http://pytorch.org/docs/master/_modules/torch/nn/modules/loss.html#CrossEntropyLoss. There, CrossEntropyLoss is defined using the F.cross_entropy function, where F comes from `from … import functional as F`. I'm unable to find the source code of the F.cross_entropy function. Does anybody know the details of this function?

I saw this link. In particular, I'm interested in the implementation of the F.cross_entropy function.


I assume you read the definition of the cross_entropy function in that file.

```python
def cross_entropy(input, target, weight=None, size_average=True, ignore_index=-100, reduce=True):
    return nll_loss(log_softmax(input, 1), target, weight, size_average, ignore_index, reduce)
```
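In other words, cross entropy is just negative log-likelihood applied to log-softmax outputs. That decomposition can be checked with a small pure-Python sketch (plain math instead of torch tensors, purely for illustration; the function names mirror the functional API but are re-implemented from scratch here):

```python
import math

def log_softmax(logits):
    # log(exp(x_i) / sum_j exp(x_j)), computed in a numerically stable way
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]

def nll_loss(log_probs, target):
    # negative log-likelihood of the target class
    return -log_probs[target]

def cross_entropy(logits, target):
    # same decomposition as F.cross_entropy: nll_loss(log_softmax(...))
    return nll_loss(log_softmax(logits), target)

# with two equal logits, the softmax probability is 1/2, so the loss is log(2)
loss = cross_entropy([0.0, 0.0], 0)
```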


Which bits did you not understand?

What is the significance of ignore_index? And one more thing I want to know: what is the range of this CrossEntropyLoss function? Will it always be between 0 and 1?

From the docs
ignore_index (int, optional) – Specifies a target value that is ignored and does not contribute to the input gradient. When size_average is True, the loss is averaged over non-ignored targets.
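A minimal pure-Python sketch of that averaging behavior (this is an illustration of the documented semantics, not PyTorch's actual implementation):

```python
import math

def cross_entropy_one(logits, target):
    # -log softmax probability of the target class, numerically stable
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[target]

def batched_cross_entropy(batch_logits, targets, ignore_index=-100):
    # compute per-sample losses, skipping any target equal to ignore_index
    losses = [cross_entropy_one(logits, t)
              for logits, t in zip(batch_logits, targets)
              if t != ignore_index]
    # size_average=True: mean over non-ignored targets only
    return sum(losses) / len(losses)

# the second sample has target -100, so only the first contributes
loss = batched_cross_entropy([[0.0, 0.0], [5.0, 1.0]], [0, -100])
```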

Also from the docs the formula for CrossEntropyLoss is
loss(x, class) = -log(exp(x[class]) / (\sum_j exp(x[j])))

Now some basic math

• exp(x[class]) is always positive
• \sum_j exp(x[j]) is always at least exp(x[class]), since the sum includes that term
• so exp(x[class]) / (\sum_j exp(x[j])) is always in the range (0, 1]
• log(anything in the range (0, 1]) is in the range (-inf, 0]

Hence -log(exp(x[class]) / (\sum_j exp(x[j]))) is in the range [0, +inf)
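A quick numerical check of that range (a pure-Python sketch of the per-sample formula, not torch code): a confident correct prediction drives the loss toward 0, while a confident wrong prediction makes it arbitrarily large.

```python
import math

def cross_entropy(logits, target):
    # loss(x, class) = -log(exp(x[class]) / sum_j exp(x[j])), stably computed
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[target]

# confidently correct: loss close to 0 (but never negative)
low = cross_entropy([10.0, 0.0], 0)

# confidently wrong: loss well above 1, showing the range is [0, +inf)
high = cross_entropy([10.0, 0.0], 1)
```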


Thanks for explaining it to me.
