# Weight in cross entropy loss

I was trying to understand how `weight` in `CrossEntropyLoss` works with a practical example. So I first ran the standard PyTorch code and then did the calculation manually, but the two losses are not the same.

```python
import math

import torch
from torch import nn

softmax = nn.Softmax(dim=1)
sc = torch.tensor([0.4, 0.36])
loss = nn.CrossEntropyLoss(weight=sc)
input = torch.tensor([[3.0, 4.0], [6.0, 9.0]])
target = torch.tensor([1, 0])
output = loss(input, target)
print(output)
# >> tensor(1.7529)
```

Now for the manual calculation, first softmax the `input`:

```python
print(softmax(input))
# >> tensor([[0.2689, 0.7311],
#            [0.0474, 0.9526]])
```

and then take the negative log of the correct-class probability and multiply by the respective weight:

`((-math.log(0.7311) * 0.36) - (math.log(0.0474) * 0.4)) / 2`

0.6662
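The arithmetic above, reproduced as a runnable check (using the rounded probabilities printed by the softmax):

```python
import math

# Softmax probabilities of the correct classes (from the printout above)
p0, p1 = 0.7311, 0.0474   # sample 0 -> class 1, sample 1 -> class 0
w0, w1 = 0.36, 0.4        # weights of those classes

# Dividing by the number of samples (2)
manual = (-math.log(p0) * w0 - math.log(p1) * w1) / 2
print(round(manual, 4))   # 0.6662
```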

What am I missing here?

Hello Mainul!

When using `CrossEntropyLoss (weight = sc)` with class weights
and the default `reduction = 'mean'`, the average loss that
is calculated is the weighted average. That is, you should divide
by the sum of the weights used for the samples, rather than by the
number of samples.
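In a recent PyTorch version, this can be checked directly with the numbers from the question (a sketch of the calculation, not the original 0.3.0 script):

```python
import torch
from torch import nn

sc = torch.tensor([0.4, 0.36])                  # per-class weights
input = torch.tensor([[3.0, 4.0], [6.0, 9.0]])
target = torch.tensor([1, 0])

# Built-in weighted loss with the default reduction='mean'
loss = nn.CrossEntropyLoss(weight=sc)(input, target)

# Manual version: weighted per-sample losses divided by the
# SUM of the weights used, not by the number of samples
probs = nn.Softmax(dim=1)(input)
num = -(sc[1] * torch.log(probs[0, 1]) + sc[0] * torch.log(probs[1, 0]))
den = sc[1] + sc[0]
print(loss.item(), (num / den).item())          # both ≈ 1.7529
```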

The following (PyTorch version 0.3.0) script illustrates this:

```python
import torch
torch.__version__

sc = torch.FloatTensor ([0.4,0.36])
loss = torch.nn.CrossEntropyLoss (weight = sc)
input = torch.autograd.Variable (torch.FloatTensor ([[3.0,4.0],[6.0,9.0]]))
target = torch.autograd.Variable (torch.LongTensor ([1,0]))
output = loss (input, target)
print (output)

probs = torch.nn.Softmax (dim = 1) (input)
output2 = -(torch.log (probs[0, 1]) * sc[1] + torch.log (probs[1, 0]) * sc[0]) / (sc[1] + sc[0])
print (output2)
print (((sc[1] + sc[0]) / 2.0) * output2)
```

Here is the output:

```python
>>> import torch
>>> torch.__version__
'0.3.0b0+591e73e'
>>>
>>> sc = torch.FloatTensor ([0.4,0.36])
>>> loss = torch.nn.CrossEntropyLoss (weight = sc)
>>> input = torch.autograd.Variable (torch.FloatTensor ([[3.0,4.0],[6.0,9.0]]))
>>> target = torch.autograd.Variable (torch.LongTensor ([1,0]))
>>> output = loss (input, target)
>>> print (output)
Variable containing:
 1.7529
[torch.FloatTensor of size 1]

>>>
>>> probs = torch.nn.Softmax (dim = 1) (input)
>>> output2 = -(torch.log (probs[0, 1]) * sc[1] + torch.log (probs[1, 0]) * sc[0]) / (sc[1] + sc[0])
>>> print (output2)
Variable containing:
 1.7529
[torch.FloatTensor of size 1]

>>> print (((sc[1] + sc[0]) / 2.0) * output2)
Variable containing:
 0.6661
[torch.FloatTensor of size 1]
```

You can see that the (weighted) `CrossEntropyLoss` and
“manual” results now match. And at the end we recover your manual
result by undoing the division by the sum of the weights.
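In current PyTorch versions the same bookkeeping can also be seen with `reduction='none'` (a sketch, not part of the 0.3.0 session above):

```python
import torch
from torch import nn

sc = torch.tensor([0.4, 0.36])
input = torch.tensor([[3.0, 4.0], [6.0, 9.0]])
target = torch.tensor([1, 0])

# Per-sample weighted losses, with no reduction applied
per_sample = nn.CrossEntropyLoss(weight=sc, reduction='none')(input, target)

# reduction='mean' divides by the sum of the weights used, not by 2
w_used = sc[target]   # weight of each sample's target class
mean_loss = nn.CrossEntropyLoss(weight=sc)(input, target)
print(mean_loss.item(), (per_sample.sum() / w_used.sum()).item())
```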

Best.

K. Frank
