Hello,
I’m having trouble understanding the behaviour of class weights in CrossEntropyLoss, specifically when reduction='mean'. I test it like this:
import torch
import torch.nn as nn
import torch.nn.functional as F

input = torch.randn(5, 2, requires_grad=True)
m = nn.LogSoftmax(dim=1)
mi = m(input)
target = torch.tensor([0, 0, 1, 1, 1])
w = torch.tensor([1, 100]).float()
Now, without weights everything behaves reasonably; the next two lines give the same result:
F.nll_loss(mi, target, reduction='none').mean()
F.nll_loss(mi, target, reduction='mean')
However, once we introduce the weight, the two results diverge:
F.nll_loss(mi, target, weight=w, reduction='none').mean()
F.nll_loss(mi, target, weight=w, reduction='mean')
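My current guess, from experimenting, is that with a weight tensor, 'mean' normalizes by the sum of the per-sample weights (w[target]) rather than by the number of samples. At least this check reproduces the weighted 'mean' result for me:

# per-sample weighted losses, i.e. each sample's loss scaled by the weight of its target class
per_sample = F.nll_loss(mi, target, weight=w, reduction='none')
# divide by the sum of the picked weights instead of by the sample count
manual_mean = per_sample.sum() / w[target].sum()
print(manual_mean, F.nll_loss(mi, target, weight=w, reduction='mean'))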
That is extremely unintuitive to me. What is the logic behind this normalization?
The formula in the documentation (NLLLoss — PyTorch 2.1 documentation) is no help, because it reuses the index n in both sums, making it seem as if the weights were the same for all samples. Are they?
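If I rewrite the 'mean' reduction with a distinct index m in the denominator, my reading would be:

loss(x, y) = sum_{n=1..N} ( -w[y_n] * x[n, y_n] ) / sum_{m=1..N} w[y_m]

i.e. each sample is scaled by the weight of its own target class, and the denominator is the sum of those picked weights. Is that the intended reading?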
I’d appreciate any help.