Weird Behavior in CrossEntropyLoss

jsager · February 5, 2019, 5:11pm

With a batch size of 1, CrossEntropyLoss seems to ignore the class weights when the ‘reduction’ parameter is left at its default value of ‘mean’.

Example code:

import torch
import torch.nn as nn

# Per-class weights for a 5-class problem
class_weights = torch.tensor([.24, .60, 1.0, .46, .89])
criterion = nn.CrossEntropyLoss(weight=class_weights, reduction='none')
a = torch.tensor([1])                      # target class index, batch size 1
b = torch.tensor([[.8, .1, 0, .1, 0]])     # raw scores (logits), shape (1, 5)
print(a)
print(b)
loss = criterion(b, a)
print(loss)

When you change reduction to ‘mean’, the loss value changes. Oddly, it becomes the value you get with no weights at all and reduction kept at ‘none’. From my understanding of the reduction parameter, it shouldn’t make any difference when the loss is computed on a single sample (one input tensor and one target tensor). Bug?
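One possible explanation, offered as a sketch rather than a definitive answer: when a weight tensor is given, reduction=‘mean’ computes a weighted mean, dividing the sum of weighted per-sample losses by the sum of the weights of the target classes instead of by the batch size. With a single sample the weight then cancels out, which would produce exactly the value described above. The snippet below checks this on the same numbers; the variable names (`logits`, `target`, `manual_mean`) are illustrative and not from the original post.

```python
import torch
import torch.nn as nn

class_weights = torch.tensor([0.24, 0.60, 1.0, 0.46, 0.89])
logits = torch.tensor([[0.8, 0.1, 0.0, 0.1, 0.0]])  # shape (1, 5)
target = torch.tensor([1])                           # batch size 1

# Per-sample loss without any class weights.
unweighted = nn.CrossEntropyLoss(reduction='none')(logits, target)

# Per-sample loss scaled by the weight of the target class.
weighted_none = nn.CrossEntropyLoss(
    weight=class_weights, reduction='none')(logits, target)

# Weighted loss with the default 'mean' reduction.
weighted_mean = nn.CrossEntropyLoss(
    weight=class_weights, reduction='mean')(logits, target)

# Weighted mean computed by hand: sum(w_i * loss_i) / sum(w_i),
# where w_i is the weight of sample i's target class.
w = class_weights[target]
manual_mean = (w * unweighted).sum() / w.sum()

print("unweighted, none :", unweighted)
print("weighted,   none :", weighted_none)
print("weighted,   mean :", weighted_mean)
print("manual weighted mean:", manual_mean)
```

If this reading is right, the weights only show up under ‘mean’ once the batch contains samples with different target classes, because a single weight appears in both the numerator and the denominator and cancels.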