We can get the gradient of the input like this:
>>> import torch
>>> import torch.nn as nn
>>> loss = nn.CrossEntropyLoss()
>>> input = torch.randn(3, 5, requires_grad=True)
>>> target = torch.empty(3, dtype=torch.long).random_(5)
>>> output = loss(input, target)
>>> output.backward()
>>> input.grad
I expected input.grad to equal softmax(input) for the non-target classes, and softmax(input) - 1 for the target class, but the result above does not match that.
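If it helps to reproduce the comparison: one likely source of the mismatch is that `nn.CrossEntropyLoss` defaults to `reduction='mean'`, which divides the gradient by the batch size. A minimal sketch (assuming that default) checking `input.grad` against `(softmax(input) - one_hot(target)) / N`:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

loss = nn.CrossEntropyLoss()  # default reduction='mean'
input = torch.randn(3, 5, requires_grad=True)
target = torch.empty(3, dtype=torch.long).random_(5)
loss(input, target).backward()

# With reduction='mean', the analytic gradient of the loss w.r.t. input is
# (softmax(input) - one_hot(target)) / batch_size
probs = F.softmax(input.detach(), dim=1)
one_hot = F.one_hot(target, num_classes=5).float()
expected = (probs - one_hot) / input.shape[0]

print(torch.allclose(input.grad, expected))  # prints True
```

With `reduction='sum'` (or per-sample losses) the `1 / batch_size` factor disappears and the gradient matches softmax(input) minus the one-hot target directly.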
PS: why does this site not send email notifications?