I checked the docs and the explanation of how the weight argument works in CrossEntropyLoss, but when I check it for more than two samples I get different results, as shown below.
For the following snippet:
inp = tensor([[0.9860, 0.1934],
[0.9590, 0.3538],
[0.1502, 0.9544],
[0.7666, 0.0535],
[0.1600, 0.3133],
[0.1827, 0.8578],
[0.2727, 0.7105],
[0.3965, 0.0156]])
target = tensor([1, 1, 1, 0, 0, 0, 1, 1])
cl_wts = 1./torch.tensor([5., 3.])
loss = nn.CrossEntropyLoss()
loss_weighted = nn.CrossEntropyLoss(weight = cl_wts)
l1 = loss(inp, target)
print(l1) ---> tensor(0.7793)
l_wt = loss_weighted(inp, target)
print(l_wt) ---> tensor(0.7839)
When I check it manually, the softmax outputs (class probabilities, not logits) are:
probs = softmax(inp, dim=1)
probs = tensor([[0.6884, 0.3116],
[0.6469, 0.3531],
[0.3091, 0.6909],
[0.6711, 0.3289],
[0.4617, 0.5383],
[0.3374, 0.6626],
[0.3923, 0.6077],
[0.5941, 0.4059]])
manual_loss = -(np.log(0.3116) + np.log(0.3531) + np.log(0.6909) + np.log(0.6711) + np.log(0.4617) + np.log(0.3374) + np.log(0.6077) + np.log(0.4059))
manual_loss = manual_loss/8 ---> 8 is the mini-batch size
print(manual_loss) ---> 0.7793355874570308, which matches l1
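The unweighted manual check above can also be written in vectorized form (the names log_probs and manual are just illustrative):

```python
import torch

inp = torch.tensor([[0.9860, 0.1934],
                    [0.9590, 0.3538],
                    [0.1502, 0.9544],
                    [0.7666, 0.0535],
                    [0.1600, 0.3133],
                    [0.1827, 0.8578],
                    [0.2727, 0.7105],
                    [0.3965, 0.0156]])
target = torch.tensor([1, 1, 1, 0, 0, 0, 1, 1])

# Log-softmax of the correct-class entry for each sample,
# averaged over the batch -- same as the manual sum above.
log_probs = torch.log_softmax(inp, dim=1)
manual = -log_probs[torch.arange(len(target)), target].mean()
print(manual)  # tensor(0.7793)
```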
However, for the weighted version:
man_loss_weighted = -(np.log(0.3116)*0.2 + np.log(0.3531)*0.2 + np.log(0.6909)*0.2 + np.log(0.6711)*0.33 + np.log(0.4617)*0.33 + np.log(0.3374)*0.33 + np.log(0.6077)*0.2 + np.log(0.4059)*0.2)/(0.2+0.33)
man_loss_weighted /=8
print(man_loss_weighted)---> 0.3633250361678566
which is not equal to the weighted loss l_wt.
How is the weighted loss being computed? Any help would be appreciated.
Thank you
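For reference, the CrossEntropyLoss docs state that with a weight argument and reduction='mean', the weighted sum is divided by the sum of the per-sample weights w[target[n]], not by the batch size. A sketch reproducing l_wt under that formula (variable names here are just illustrative):

```python
import torch
import torch.nn as nn

inp = torch.tensor([[0.9860, 0.1934],
                    [0.9590, 0.3538],
                    [0.1502, 0.9544],
                    [0.7666, 0.0535],
                    [0.1600, 0.3133],
                    [0.1827, 0.8578],
                    [0.2727, 0.7105],
                    [0.3965, 0.0156]])
target = torch.tensor([1, 1, 1, 0, 0, 0, 1, 1])
cl_wts = 1. / torch.tensor([5., 3.])

# The weight of each sample is the weight of its *target* class:
# class 0 -> 0.2, class 1 -> 0.3333.
per_sample_w = cl_wts[target]

log_probs = torch.log_softmax(inp, dim=1)
per_sample_nll = -log_probs[torch.arange(len(target)), target]

# Weighted sum, normalized by the sum of the selected weights
# (not by the batch size 8).
manual_wt = (per_sample_w * per_sample_nll).sum() / per_sample_w.sum()
print(manual_wt)  # matches nn.CrossEntropyLoss(weight=cl_wts)(inp, target)
```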