Weight in cross entropy loss

Hello Mainul!

When you use CrossEntropyLoss (weight = sc) with class weights and the
default mean reduction, the average loss that is computed is a weighted
average. That is, the sum of the per-sample losses is divided by the sum
of the weights applied to those samples, rather than by the number of
samples.
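
In pseudocode (a sketch of this convention, with w the weight vector,
loss_i the unreduced loss for sample i, and the weight for each sample
looked up by its target class):

    mean_loss = sum_i (w[target[i]] * loss_i) / sum_i (w[target[i]])

rather than

    mean_loss = sum_i (w[target[i]] * loss_i) / n_samples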

The following script (PyTorch version 0.3.0) illustrates this:

import torch
torch.__version__

sc = torch.FloatTensor ([0.4,0.36])
loss = torch.nn.CrossEntropyLoss (weight = sc)
input = torch.autograd.Variable (torch.FloatTensor ([[3.0,4.0],[6.0,9.0]]))
target = torch.autograd.Variable (torch.LongTensor ([1,0]))
output = loss (input, target)
print (output)

probs = torch.nn.Softmax (dim = 1) (input)
output2 = -(torch.log (probs[0, 1]) * sc[1] + torch.log (probs[1, 0]) * sc[0]) / (sc[0] + sc[1])
print (output2)
print (((sc[0] + sc[1]) / 2.0) * output2)
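
(The first print applies the weighted CrossEntropyLoss directly, output2
recomputes the same weighted mean by hand from the softmax probabilities,
and the last print rescales output2 by the plain average of the weights to
show what you would get by dividing by the number of samples instead.)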

Here is the output:

>>> import torch
>>> torch.__version__
'0.3.0b0+591e73e'
>>>
>>> sc = torch.FloatTensor ([0.4,0.36])
>>> loss = torch.nn.CrossEntropyLoss (weight = sc)
>>> input = torch.autograd.Variable (torch.FloatTensor ([[3.0,4.0],[6.0,9.0]]))
>>> target = torch.autograd.Variable (torch.LongTensor ([1,0]))
>>> output = loss (input, target)
>>> print (output)
Variable containing:
 1.7529
[torch.FloatTensor of size 1]

>>>
>>> probs = torch.nn.Softmax (dim = 1) (input)
>>> output2 = -(torch.log (probs[0, 1]) * sc[1] + torch.log (probs[1, 0]) * sc[0]) / (sc[0] + sc[1])
>>> print (output2)
Variable containing:
 1.7529
[torch.FloatTensor of size 1]

>>> print (((sc[0] + sc[1]) / 2.0) * output2)
Variable containing:
 0.6661
[torch.FloatTensor of size 1]

You can see that the (weighted) CrossEntropyLoss result and the
“manual” result now match. And at the end we recover your manual
result by undoing the division by the sum of the weights, i.e., by
dividing the weighted sum by the number of samples instead.
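
For what it's worth, the same check can be written in more recent PyTorch
versions without Variable. This is just a sketch, assuming the current
CrossEntropyLoss API (with reduction = 'mean' as the default); the printed
values should match the 0.3.0 result above:

import torch

sc = torch.tensor ([0.4, 0.36])
loss = torch.nn.CrossEntropyLoss (weight = sc)   # default reduction = 'mean'
input = torch.tensor ([[3.0, 4.0], [6.0, 9.0]])
target = torch.tensor ([1, 0])
print (loss (input, target))    # weighted mean, about 1.7529

# "manual" version: weighted sum of per-sample losses divided by the
# sum of the sample weights (not by the number of samples)
logprobs = torch.log_softmax (input, dim = 1)
output2 = -(logprobs[0, 1] * sc[1] + logprobs[1, 0] * sc[0]) / (sc[1] + sc[0])
print (output2)                 # also about 1.7529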

Best.

K. Frank
