You are using it correctly!
However, I think the docs are missing an explanation of how size_average interacts with the weight argument.
Have a look at the docs of NLLLoss: they state that each loss will be divided by the sum of the corresponding class weights if reduce=True and size_average=True.
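In other words, the weighted reduction is a weighted mean: the sum of the weighted per-sample losses is divided by the sum of the weights picked out by the targets, not by the batch size. Here is a minimal sketch of that reduction (weighted_mean_nll is a hypothetical helper for illustration, not a PyTorch function):
import torch
import torch.nn.functional as F

def weighted_mean_nll(x, target, weight):
    # Hypothetical helper: reproduces the weighted size_average reduction manually
    log_probs = F.log_softmax(x, dim=1)
    # Per-sample NLL, scaled by each target's class weight
    per_sample = -log_probs.gather(1, target.unsqueeze(1)).squeeze(1) * weight[target]
    # Divide by the sum of the selected weights, not by the number of samples
    return per_sample.sum() / weight[target].sum()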
In your case, since you just have one sample, the loss will be divided by 10, which yields exactly the same result as the unweighted loss:
import torch
import torch.nn.functional as F

x = torch.Tensor([[1.0, 2.0, 3.0]])
y = torch.LongTensor([1])
# Unweighted loss
w = torch.Tensor([1.0, 1.0, 1.0])
F.cross_entropy(x, y, w, size_average=False)  # -> tensor(1.4076)
# Weighting class 1 by 10 scales the summed loss by w[y] = 10 ...
w = torch.Tensor([1.0, 10.0, 1.0])
F.cross_entropy(x, y, w, size_average=False)  # -> tensor(14.0761)
# ... and size_average=True divides by that same w[y] again
F.cross_entropy(x, y, w, size_average=True)   # -> tensor(1.4076)
# Manual computation of the same result
loss = 10.0 * (-x[0, y] + torch.log(torch.sum(torch.exp(x))))
averaged_loss = loss / w[y]  # equals the unweighted loss
If you add a second sample, however, the losses will differ:
x = torch.Tensor([[1.0, 2.0, 3.0], [2.0, 1.0, 3.0]])
y = torch.LongTensor([1, 1])
# Unweighted sum of both per-sample losses
w = torch.Tensor([1.0, 1.0, 1.0])
F.cross_entropy(x, y, w, size_average=False)
> tensor(3.8152)
# Each per-sample loss is scaled by w[y] = 10
w = torch.Tensor([1.0, 10.0, 1.0])
F.cross_entropy(x, y, w, size_average=False)
> tensor(38.1521)
# The weighted sum is divided by the sum of the selected weights (10 + 10)
F.cross_entropy(x, y, w, size_average=True)
> tensor(1.9076)
In this example, 38.1521 will be divided by the sum of the corresponding class weights (w[1] for each sample), i.e. 38.1521 / 20 = 1.9076.
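You can reproduce this reduction manually for the batch case as well; a small sketch using plain log_softmax arithmetic (the numbers match the outputs above):
import torch
import torch.nn.functional as F

x = torch.Tensor([[1.0, 2.0, 3.0], [2.0, 1.0, 3.0]])
y = torch.LongTensor([1, 1])
w = torch.Tensor([1.0, 10.0, 1.0])

log_probs = F.log_softmax(x, dim=1)
# Weighted per-sample losses: w[y_i] * (-log_softmax(x_i)[y_i])
per_sample = -log_probs.gather(1, y.unsqueeze(1)).squeeze(1) * w[y]
per_sample.sum()               # -> tensor(38.1521), matches size_average=False
per_sample.sum() / w[y].sum()  # -> tensor(1.9076), matches size_average=True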