Hey, I’m trying to reproduce the CrossEntropyLoss implementation (in order to change it later for my needs), and currently I can’t match the results when non-uniform weights are provided and size_average is set to True (if the weights are uniform and/or size_average is False, the results match, at least their printed representations).

I tried to follow the formula in the PyTorch reference, but it seems that either I’m missing something or the weights are applied slightly differently (or maybe I have a bug, of course).

Here’s my implementation:
```python
import torch

num_classes = 5
num_samples = 3

wts = torch.abs(torch.randn(num_classes))
wts /= torch.sum(wts)
weights = torch.autograd.Variable(wts)
# weights = torch.autograd.Variable(torch.ones(num_classes))

input = torch.autograd.Variable(torch.randn(num_samples, num_classes))
target = torch.autograd.Variable(torch.LongTensor(num_samples).random_(num_classes))

# todo: check why size_average changes weight contribution
loss = torch.nn.CrossEntropyLoss(weight=weights, size_average=False)
output = loss(input, target)

correct_confidences = torch.exp(input[range(num_samples), target])
total_confidences = torch.sum(torch.exp(input), dim=1)
p_t = correct_confidences / total_confidences
CE = torch.sum(weights.index_select(0, target) * (-torch.log(p_t)))

print 'torch CE =', output
print 'manual CE =', CE
```
Example of output:
```
torch CE = Variable containing:
 1.8163
[torch.FloatTensor of size 1]

manual CE = Variable containing:
 1.8163
[torch.FloatTensor of size 1]
```
If I change size_average to True and adjust the manual CE computation accordingly, here’s an example of what I get:
```
torch CE = Variable containing:
 1.6775
[torch.FloatTensor of size 1]

manual CE = Variable containing:
 0.3721
[torch.FloatTensor of size 1]
```
I know that dividing by total_confidences directly is not the most numerically stable approach, but I guess that’s not the main issue here. Also, since I’m a newbie at PyTorch, some places might look odd, so feel free to point those out too.
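For what it’s worth, one plausible explanation (based on the documented behavior of CrossEntropyLoss, not something verified against the exact code above): when per-class weights are given and the loss is averaged, PyTorch normalizes by the sum of the weights of the sampled targets, not by num_samples. A minimal sketch of that hypothesis, written against the current API where reduction='mean' replaces the deprecated size_average=True:

```python
import torch

torch.manual_seed(0)
num_classes, num_samples = 5, 3

# non-uniform per-class weights, as in the question
weights = torch.abs(torch.randn(num_classes))
weights /= weights.sum()

inp = torch.randn(num_samples, num_classes)
target = torch.randint(num_classes, (num_samples,))

# reduction='mean' is the modern equivalent of size_average=True
loss = torch.nn.CrossEntropyLoss(weight=weights, reduction='mean')
output = loss(inp, target)

# manual weighted cross-entropy via log-softmax
log_p = inp - inp.exp().sum(dim=1, keepdim=True).log()
per_sample = -weights[target] * log_p[range(num_samples), target]

# normalize by the sum of the selected weights, NOT by num_samples
manual = per_sample.sum() / weights[target].sum()

print(torch.allclose(output, manual))  # prints True
```

If this is the discrepancy, then with uniform weights both normalizations coincide up to a constant factor, which would also explain why the unweighted case matches.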