Details of torch.nn.CrossEntropyLoss

For a problem I have used CrossEntropyLoss as the criterion to evaluate the performance of a neural network. To learn the details of this function I visited this page. There, CrossEntropyLoss is defined using the F.cross_entropy function, where F is declared as from … import functional as F. I’m unable to find the source code of the F.cross_entropy function. Does anybody know the details of this function?

I saw this link. In particular, I’m interested in the implementation of the F.cross_entropy function.


I assume you read the definition of the cross_entropy function in that file.

def cross_entropy(input, target, weight=None, size_average=True, ignore_index=-100, reduce=True):
    return nll_loss(log_softmax(input, 1), target, weight, size_average, ignore_index, reduce)
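
As a quick sanity check of that one-liner, the sketch below compares the composed form against the built-in call. It assumes a recent PyTorch release, where the defaults of the newer reduction argument replace size_average/reduce; the tensor shapes and values are made up for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 3)            # 4 samples, 3 classes (raw scores, not probabilities)
target = torch.tensor([0, 2, 1, 2])   # one class index per sample

# cross_entropy = log_softmax over the class dimension, followed by nll_loss
composed = F.nll_loss(F.log_softmax(logits, dim=1), target)
direct = F.cross_entropy(logits, target)
module = torch.nn.CrossEntropyLoss()(logits, target)

print(composed.item(), direct.item(), module.item())
```

All three printed values should agree up to floating-point error, since the module simply forwards to the functional call.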

Which bits did you not understand?

What is the practical significance of ignore_index? One more thing I want to know is the range of the CrossEntropyLoss function. Will it always be between 0 and 1?

From the docs
ignore_index (int, optional) – Specifies a target value that is ignored and does not contribute to the input gradient. When size_average is True, the loss is averaged over non-ignored targets.
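
A small sketch of what that means in practice, assuming a recent PyTorch release with the default mean reduction. The target values are invented for illustration, with the second sample marked to be ignored.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 3, requires_grad=True)
target = torch.tensor([0, -100, 1, 2])   # second sample carries the ignore value

loss = F.cross_entropy(logits, target, ignore_index=-100)
loss.backward()

# Same loss computed by hand over only the samples that are kept
keep = target != -100
expected = F.cross_entropy(logits[keep], target[keep])

print(loss.item(), expected.item())   # averaged over the 3 non-ignored targets
print(logits.grad[1])                 # gradient row of the ignored sample
```

The ignored sample neither enters the average nor receives any gradient, which is why this option is commonly used for padding positions or unlabeled pixels.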

Also from the docs the formula for CrossEntropyLoss is
loss(x, class) = -log(exp(x[class]) / (\sum_j exp(x[j])))

Now some basic math

  • exp(x[class]) is always positive
  • \sum_j exp(x[j]) is always at least exp(x[class]), because exp(x[class]) is one of its terms and every other term is positive
  • so exp(x[class]) / (\sum_j exp(x[j])) is always in the range (0, 1]
  • log(anything in the range (0, 1]) is in the range (-inf, 0]

Hence -log(exp(x[class]) / (\sum_j exp(x[j]))) is in the range [0, +inf). So the loss is never negative, but it is not bounded above by 1.
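
To make the range concrete, here is a plain-Python sketch of the per-sample formula. The helper name cross_entropy_single and the score vectors are made up for illustration, and the log-sum-exp shift is only there for numerical stability.

```python
import math

def cross_entropy_single(x, cls):
    """-log(exp(x[cls]) / sum_j exp(x[j])) for one sample (illustrative helper)."""
    m = max(x)  # subtract the max before exponentiating to avoid overflow
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in x))
    return log_sum_exp - x[cls]

confident_right = cross_entropy_single([10.0, 0.0, 0.0], 0)  # high score on the true class
uniform = cross_entropy_single([0.0, 0.0, 0.0], 0)           # all classes equally likely
confident_wrong = cross_entropy_single([0.0, 20.0, 0.0], 0)  # high score on a wrong class

print(confident_right, uniform, confident_wrong)
```

The first value is close to 0, the second equals log(3), which is already above 1, and the third grows roughly with the gap between the wrong score and the true-class score.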


Thanks for the explanation.
