For a problem I used CrossEntropyLoss as the criterion to evaluate the performance of a neural network. To learn the details of this function I visited this page: http://pytorch.org/docs/master/_modules/torch/nn/modules/loss.html#CrossEntropyLoss. There, CrossEntropyLoss is defined in terms of the F.cross_entropy function, where F is imported as from … import functional as F. I’m unable to find the source code of the F.cross_entropy function. Does anybody know the details of this function?
I saw this link. In particular, I’m interested in the implementation of the F.cross_entropy function.
I assume you read the definition of the
cross_entropy function in that file.
def cross_entropy(input, target, weight=None, size_average=True,
                  ignore_index=-100, reduce=True):
    return nll_loss(log_softmax(input, 1), target, weight,
                    size_average, ignore_index, reduce)
Which bits did you not understand?
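To make the decomposition concrete, here is a minimal pure-Python sketch (not the actual PyTorch source, which operates on tensors) showing how cross_entropy is just nll_loss composed with log_softmax for a single sample:

```python
import math

def log_softmax(x):
    # Log-softmax with the usual max-subtraction trick for numerical stability.
    m = max(x)
    log_sum = m + math.log(sum(math.exp(v - m) for v in x))
    return [v - log_sum for v in x]

def nll_loss(log_probs, target):
    # Negative log-likelihood of the target class.
    return -log_probs[target]

def cross_entropy(logits, target):
    # Mirrors the PyTorch decomposition: cross_entropy = nll_loss(log_softmax(...)).
    return nll_loss(log_softmax(logits), target)

loss = cross_entropy([2.0, 1.0, 0.1], target=0)
```

The two-step structure is why PyTorch recommends feeding raw logits (not softmax outputs) to CrossEntropyLoss: the log-softmax is applied internally.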
What is the physical significance of ignore_index? And one more thing I want to know: what is the range of this CrossEntropyLoss function? Will it always be between 0 and 1?
From the docs
ignore_index (int, optional) – Specifies a target value that is ignored and does not contribute to the input gradient. When size_average is True, the loss is averaged over non-ignored targets.
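As a sketch of that behavior (plain Python rather than the tensor implementation, and using the same default sentinel of -100): samples whose target equals ignore_index are simply excluded from both the sum and the count used for averaging:

```python
import math

def log_softmax(x):
    # Numerically stable log-softmax over a list of logits.
    m = max(x)
    log_sum = m + math.log(sum(math.exp(v - m) for v in x))
    return [v - log_sum for v in x]

def cross_entropy_batch(logits_batch, targets, ignore_index=-100):
    # Targets equal to ignore_index contribute neither to the loss sum
    # nor to the denominator of the average.
    losses = [-log_softmax(logits)[t]
              for logits, t in zip(logits_batch, targets)
              if t != ignore_index]
    return sum(losses) / len(losses)

# One of the three targets is ignored; only the other two are averaged.
avg = cross_entropy_batch([[1.0, 1.0], [5.0, -5.0], [1.0, 1.0]],
                          [0, -100, 1])
```

This is typically used to mask out padding tokens in sequence tasks, so padded positions produce no gradient.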
Also from the docs the formula for CrossEntropyLoss is
loss(x, class) = -log(exp(x[class]) / (\sum_j exp(x[j])))
Now some basic math
- exp(x[class]) is always positive
- \sum_j exp(x[j]) is always greater than exp(x[class])
- so exp(x[class]) / (\sum_j exp(x[j])) is always in the range [0, 1]
- log(anything in the range [0, 1]) is in the range (-inf, 0]
- so -log(exp(x[class]) / (\sum_j exp(x[j]))) is in the range [0, +inf)
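You can check this range numerically. The sketch below (plain Python, hypothetical helper names) evaluates the formula on random logits: the loss is always non-negative and is unbounded above, approaching 0 only when the target logit dominates the others:

```python
import math
import random

def cross_entropy(logits, target):
    # loss = -(x[target] - log(sum_j exp(x[j]))), computed stably.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return -(logits[target] - log_sum)

random.seed(0)
losses = [cross_entropy([random.uniform(-5, 5) for _ in range(4)],
                        random.randrange(4))
          for _ in range(1000)]

all_nonneg = all(l >= 0 for l in losses)          # never below 0
near_zero = cross_entropy([100.0, 0.0, 0.0, 0.0], 0)  # dominant target logit
```

So the answer to the question above is no: the loss is not confined to [0, 1]; a confidently wrong prediction can make it arbitrarily large.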
Thanks for the explanation.