# Weights in NLLLoss behave unexpectedly

Hello,

I’m having trouble understanding the behaviour of class weights in CrossEntropyLoss, specifically when `reduction='mean'`. I test it like this:

import torch
import torch.nn as nn
import torch.nn.functional as F

input = torch.randn(5, 2)  # placeholder logits: 5 samples, 2 classes (my actual `input` isn't shown here)
m = nn.LogSoftmax(dim=1)
mi = m(input)

target = torch.tensor([0, 0, 1, 1, 1])
w = torch.tensor([1, 100]).float()

Now, without weights everything behaves reasonably; the next two lines give the same result:

F.nll_loss(mi, target, reduction='none').mean()
F.nll_loss(mi, target, reduction='mean')

However, once we introduce the weight, the results differ:

F.nll_loss(mi, target, weight=w, reduction='none').mean()
F.nll_loss(mi, target, weight=w, reduction='mean')

That is extremely unintuitive to me; what is the logic behind this?
The formula in the documentation (https://pytorch.org/docs/stable/generated/torch.nn.NLLLoss.html) is no help, because it reuses the variable name "n", making it seem as if the weights are the same for all samples. Are they?

I’d appreciate any help.

Hello Awesome!

Passing weights to `NLLLoss` (and `CrossEntropyLoss`) gives, with
`reduction = 'mean'`, a weighted average: the sum of the weighted
per-sample losses is divided by the sum of the per-sample weights,
where each sample's weight is the weight of its target class.
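
In symbols (using the notation of the NLLLoss documentation, where
`y_n` is the target class of sample `n` and `x_{n, y_n}` is its
log-probability):

$$
\ell_n = -\, w_{y_n}\, x_{n,\, y_n},
\qquad
\text{loss} = \frac{\sum_n \ell_n}{\sum_n w_{y_n}}
$$

So the weight is indexed by each sample's target class; the weights
are not the same for all samples unless all targets happen to share
a class.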

In your `reduction = 'none'` version:

```
F.nll_loss(mi, target, weight=w, reduction='none').mean()
```

by the time you get to the call to pytorch's tensor `.mean()`, the weights
are no longer available, so `.mean()` divides by the number of samples
rather than by the sum of the weights.
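
Here is a minimal sketch (reusing `mi`, `target`, and `w` from your
post) showing that dividing the unreduced, weighted losses by the sum
of the per-sample weights reproduces `reduction = 'mean'`:

```
# reduction='none' losses are already multiplied by the per-sample
# weights w[target], but nothing has divided by their sum yet
per_sample = F.nll_loss(mi, target, weight=w, reduction='none')

# plain .mean() divides by the number of samples ...
plain_mean = per_sample.mean()

# ... whereas reduction='mean' divides by the sum of the per-sample weights
weighted_mean = per_sample.sum() / w[target].sum()

print(torch.allclose(weighted_mean,
                     F.nll_loss(mi, target, weight=w, reduction='mean')))  # True
print(torch.allclose(plain_mean,
                     F.nll_loss(mi, target, weight=w, reduction='mean')))  # False (in general)
```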

An example script illustrating this can be found in this post:

Best.

K. Frank


Hi KFrank,

Thank you, I get it now. It seems a bit odd to me, but I also understand the reasoning behind it.