Hello

I am reading the PyTorch 2.1 documentation for `CrossEntropyLoss`.

There are parts I don’t understand.

How should I read this part, and what is it?

These might be dumb questions. I am a beginner, so please help me.

Also, I don’t get why `y_n` appears in the subscript here.

Hi Pakpoom!

This is probably not the best (nor particularly standard) notation.

As a practical matter, it means that you leave out (“ignore”) any terms for which the `target` class, `y_n`, has the `ignore_index` value.

More mathematically, you should take `1{y_n != ignore_index}` to mean `0` if `y_n == ignore_index` and `1` if `y_n != ignore_index`.
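As a small illustration of that indicator, here is a plain-Python sketch. The names `indicator`, `targets`, and `mask` are made up for this example; `-100` is the documented default value of `ignore_index`.

```python
# Hypothetical helper, written only to illustrate the notation 1{y_n != ignore_index}.
IGNORE_INDEX = -100  # default ignore_index of CrossEntropyLoss


def indicator(y_n, ignore_index=IGNORE_INDEX):
    """Return 0 if the sample's label equals ignore_index, otherwise 1."""
    return 0 if y_n == ignore_index else 1


# Ground-truth labels y_n for a batch of four samples; the second one is ignored.
targets = [2, -100, 0, 1]
mask = [indicator(y_n) for y_n in targets]
print(mask)  # [1, 0, 1, 1]
```

Multiplying each sample's loss term by its entry in `mask` drops the ignored samples from the sum.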

Let me also answer the question in your second post here:

`n` is the index of the sample within the batch you are processing. `y_n` is the ground-truth `target` integer class label for that sample.

`CrossEntropyLoss` lets you assign different weights to different classes. `w_{y_n}` is the weight of the class with which sample `n` has been labelled.
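For reference, the per-sample (unreduced) loss that these symbols come from can be written as below. This is a transcription of the formula as I read it in the `CrossEntropyLoss` documentation; `C` (the number of classes) and `c` (the class index in the sum) are the documentation's symbols and do not appear elsewhere in this thread.

```latex
\ell_n = - w_{y_n} \,
         \log \frac{\exp(x_{n, y_n})}{\sum_{c=1}^{C} \exp(x_{n, c})}
         \cdot \mathbb{1}\{ y_n \neq \text{ignore\_index} \}
```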

Inside of the `exp()`, `x_{n, y_n}` goes like this. `x` is the so-called `input` to `CrossEntropyLoss`. That is, it is (typically) the prediction made by your model. `x_n` is the prediction for the `n`th sample in the batch. But your prediction is a set of (unnormalized) log-probabilities for each of your classes. So `x_{n, y_n}` is the predicted (unnormalized) log-probability for sample `n` for the class `y_n`, which is the ground-truth class label assigned to sample `n`.
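To see how the pieces fit together, here is a minimal plain-Python sketch of the weighted, mean-reduced loss. It is not PyTorch's implementation, only an illustration of the indexing; the function name `cross_entropy` and its argument names are chosen for this example, and it assumes class-index targets with the default `reduction='mean'`.

```python
import math


def cross_entropy(x, y, w=None, ignore_index=-100):
    """Illustrative mean-reduced cross entropy (hypothetical helper, not the library code).

    x: list of samples, each a list of unnormalized log-probabilities (one per class)
    y: list of integer class labels, one per sample
    w: optional list of per-class weights (defaults to 1.0 for every class)
    """
    if w is None:
        w = [1.0] * len(x[0])
    total = 0.0
    weight_sum = 0.0
    for x_n, y_n in zip(x, y):
        if y_n == ignore_index:
            continue  # indicator is 0: this sample contributes nothing
        # log of sum_c exp(x_{n,c}), shifted by the max for numerical stability
        m = max(x_n)
        log_norm = m + math.log(sum(math.exp(v - m) for v in x_n))
        # x_n[y_n] is x_{n, y_n}: the score of sample n for its own true class
        log_prob = x_n[y_n] - log_norm
        total += -w[y_n] * log_prob  # w[y_n] is w_{y_n}
        weight_sum += w[y_n]
    # With every sample ignored there is nothing to average over
    return total / weight_sum if weight_sum > 0 else float("nan")


# Two classes with equal scores: the loss equals math.log(2), about 0.693
print(cross_entropy([[0.0, 0.0]], [0]))
```

Note that the label `y_n` is used twice: once to pick the score `x_n[y_n]` out of the prediction, and once to pick the class weight `w[y_n]`. That double use is why `y_n` shows up as a subscript.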

Best.

K. Frank
