Question about weight in CrossEntropyLoss

Hi Philipp!

This is correct (if I understand what you are saying).

The “mean reduction” computes a (conventional) weighted average,
that is, it does divide by the sum of the weights.
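
As a rough illustration, here is a plain-Python sketch of the formula as I understand it: each sample's loss is weighted by the class weight of its target, and the sum is divided by the sum of those weights. The helper names (per_sample_ce, weighted_mean_ce) and the numbers are made up for this example; this is not PyTorch's actual implementation.

```python
import math

def per_sample_ce(logits, target):
    """Unweighted cross entropy for one sample: -log_softmax(logits)[target]."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

def weighted_mean_ce(batch_logits, targets, class_weights):
    """Weighted average sum(w_i * l_i) / sum(w_i), with w_i = class_weights[target_i]."""
    weights = [class_weights[t] for t in targets]
    losses = [per_sample_ce(z, t) for z, t in zip(batch_logits, targets)]
    return sum(w * l for w, l in zip(weights, losses)) / sum(weights)

# Hypothetical batch: 3 samples, 3 classes.
batch_logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 0.3], [-0.5, 1.5, 0.0]]
targets = [0, 2, 1]
class_weights = [1.0, 2.0, 5.0]

loss = weighted_mean_ce(batch_logits, targets, class_weights)
print(loss)
```

If I have this right, torch.nn.CrossEntropyLoss (weight = class_weights), with its default mean reduction, should give the same number for the same logits and targets. Note that multiplying all of the class weights by a common factor leaves the result unchanged, because the factor cancels between numerator and denominator.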

This makes sense to me because if, by happenstance, all of the
samples in the batch have the same loss, loss_all, I would like the
mean reduction over that batch also to give a batch mean of loss_all.
The conventional weighted average does this.
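
To make this concrete, here is a small arithmetic sketch with made-up numbers (loss_all = 0.7 and four arbitrary per-sample weights). It compares dividing by the sum of the weights with dividing by the number of samples.

```python
# Every sample in the (hypothetical) batch has the same loss.
loss_all = 0.7
sample_weights = [1.0, 2.0, 5.0, 5.0]  # arbitrary illustrative weights

weighted_losses = [w * loss_all for w in sample_weights]

# Conventional weighted average: divide by the sum of the weights.
weighted_average = sum(weighted_losses) / sum(sample_weights)

# Alternative: divide by the number of samples instead.
divide_by_count = sum(weighted_losses) / len(sample_weights)

print(weighted_average)  # recovers loss_all (up to round-off)
print(divide_by_count)   # scaled by the average weight, so not loss_all
```

Only the first version returns loss_all; the second depends on the overall scale of the weights.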

These two threads (about NLLLoss, but it's the same issue) give some
additional words of explanation:

Best.

K. Frank