CrossEntropyLoss backward is not correct

We can get the gradient of the loss with respect to the input as follows:

>>> import torch
>>> import torch.nn as nn
>>> loss = nn.CrossEntropyLoss()
>>> input = torch.randn(3, 5, requires_grad=True)
>>> target = torch.empty(3, dtype=torch.long).random_(5)
>>> output = loss(input, target)
>>> output.backward()
>>> input.grad

I expected input.grad to equal softmax(input) for the non-target classes and 1 - softmax(input) for the target class,
but the result above does not match.

Could you post the output and point out which values look incorrect?

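From what I can see, the gradient is correct.
For logits z, target class i, and loss L (for a single sample),

dL / dz_i = softmax(z)_i - 1
dL / dz_j = softmax(z)_j    (for j != i)

Note that the target-class entry is softmax - 1, not 1 - softmax. Also, nn.CrossEntropyLoss averages over the batch by default (reduction='mean'), so each row of input.grad is additionally divided by the batch size (3 in the example above).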
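
One way to check this is to build the expected gradient by hand and compare it with input.grad. The snippet below is only a sketch of that check: the seed, the variable names `probs`, `one_hot`, and `expected`, and the tolerance are my own choices, and it assumes the default reduction='mean'.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)  # arbitrary seed, only so the run is repeatable

loss = nn.CrossEntropyLoss()  # default reduction='mean'
input = torch.randn(3, 5, requires_grad=True)
target = torch.empty(3, dtype=torch.long).random_(5)

output = loss(input, target)
output.backward()

# Hand-built gradient: (softmax(z) - one_hot(target)) / batch_size
probs = F.softmax(input.detach(), dim=1)
one_hot = F.one_hot(target, num_classes=5).to(probs.dtype)
expected = (probs - one_hot) / input.shape[0]

# Compare autograd's result with the hand-built one
print(torch.allclose(input.grad, expected, atol=1e-6))
print(input.grad)
print(expected)
```

If the two tensors agree, the remaining difference from a raw softmax(input) comparison comes from the subtracted one-hot target and the division by the batch size. Passing reduction='sum' to nn.CrossEntropyLoss should remove the division.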