Math in crossentropy loss document


I am reading the CrossEntropyLoss page in the PyTorch 2.1 documentation.

There are parts I don’t understand.

How should I read this part, and what is it?

These might be dumb questions. I am a beginner, so please help me.

Also, I don’t get why y_n appears in the subscript here.

Hi Pakpoom!

This is probably not the best (nor particularly standard) notation.

As a practical matter, it means that you leave out (“ignore”) any terms for
which the target class, y_n, has the ignore_index value.

More mathematically, you should take 1{y_n != ignore_index} to mean
0 if y_n == ignore_index and 1 if y_n != ignore_index.
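
To make the 0-or-1 reading concrete, here is a small plain-Python sketch (not
PyTorch's actual code). The helper name indicator and the example targets are
made up for illustration; -100 is the documented default for ignore_index.

```python
IGNORE_INDEX = -100  # documented default value of ignore_index


def indicator(y_n, ignore_index=IGNORE_INDEX):
    """Hypothetical helper: 1 if sample n counts toward the loss, 0 if it is ignored."""
    return 0 if y_n == ignore_index else 1


# Made-up targets for a batch of four samples; the second one is ignored.
targets = [2, -100, 0, 1]
mask = [indicator(y_n) for y_n in targets]
print(mask)  # [1, 0, 1, 1]
```

Multiplying each sample's loss term by its mask entry is the same as leaving
the ignored terms out of the sum.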

Let me also answer the question in your second post here:

n is the index of the sample within the batch you are processing. y_n is the
ground-truth target integer class label for that sample.

CrossEntropyLoss lets you assign different weights to different classes.
w_{y_n} is the weight of the class with which sample n has been labelled.
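
As a rough illustration of the double subscript, here is a plain-Python sketch
where the class weights and targets are made-up values. The inner index y_n
picks the class, and the outer lookup picks that class's weight.

```python
# Hypothetical per-class weights w_c for classes 0, 1, 2.
class_weights = [1.0, 2.0, 0.5]

# Hypothetical targets y_n for a batch of four samples.
targets = [2, 0, 0, 1]

# w_{y_n}: for each sample n, look up the weight of its ground-truth class.
sample_weights = [class_weights[y_n] for y_n in targets]
print(sample_weights)  # [0.5, 1.0, 1.0, 2.0]
```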

Inside of the exp(), x_{n, y_n} goes like this. x is the so-called input to
CrossEntropyLoss. That is, it is (typically) the prediction made by your model.
x_n is the prediction for the nth sample in the batch. But your prediction is a
set of (unnormalized) log-probabilities, one for each of your classes. So
x_{n, y_n} is the predicted (unnormalized) log-probability for sample n for the
class y_n, which is the ground-truth class label assigned to sample n.
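
Putting the pieces together, here is a plain-Python sketch of the formula,
written term by term. It is only my reading of the documented formula for the
default "mean" reduction, not PyTorch's implementation. The function name and
the example numbers are made up, and it assumes at least one target is not
ignored.

```python
import math


def cross_entropy(x, y, weight=None, ignore_index=-100):
    """Sketch of weighted mean cross-entropy over a batch.

    x[n][c] is the unnormalized log-probability of class c for sample n.
    y[n] is the integer ground-truth class label of sample n.
    Assumes at least one target differs from ignore_index.
    """
    num_classes = len(x[0])
    if weight is None:
        weight = [1.0] * num_classes  # unweighted case: every w_c is 1

    total = 0.0
    weight_sum = 0.0
    for x_n, y_n in zip(x, y):
        if y_n == ignore_index:
            continue  # the indicator is 0, so this term is left out
        # log of the softmax denominator: log(sum_c exp(x_{n,c}))
        log_norm = math.log(sum(math.exp(v) for v in x_n))
        # l_n = -w_{y_n} * log(exp(x_{n,y_n}) / sum_c exp(x_{n,c}))
        total += -weight[y_n] * (x_n[y_n] - log_norm)
        weight_sum += weight[y_n]
    # "mean" reduction: divide by the summed weights of the samples that were kept
    return total / weight_sum


# Made-up batch: two samples, two classes.
x = [[0.0, 0.0], [1.0, 3.0]]
y = [0, 1]
loss = cross_entropy(x, y)
print(loss)
```

For the same inputs, torch.nn.CrossEntropyLoss with its default settings should
give the same value up to floating-point precision.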


K. Frank
