We can get the gradient of the input from the loss:
>>> import torch
>>> import torch.nn as nn
>>> loss = nn.CrossEntropyLoss()
>>> input = torch.randn(3, 5, requires_grad=True)
>>> target = torch.empty(3, dtype=torch.long).random_(5)
>>> output = loss(input, target)
>>> output.backward()
>>> input.grad
I expected input.grad to equal softmax(input) for the non-target classes, and softmax(input) - 1 for the target class,
but the result above does not match that.
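A minimal sketch that may explain the mismatch: with the default reduction='mean', CrossEntropyLoss divides the loss by the batch size, so the gradient is (softmax(input) - one_hot(target)) / N rather than softmax(input) - one_hot(target) itself. The variable names below are illustrative, not from the snippet above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
loss = nn.CrossEntropyLoss()  # default reduction='mean'
inp = torch.randn(3, 5, requires_grad=True)
target = torch.empty(3, dtype=torch.long).random_(5)
loss(inp, target).backward()

# Analytic gradient of mean cross-entropy w.r.t. the logits:
# (softmax(inp) - one_hot(target)) / batch_size.
# The 1/batch_size factor comes from the 'mean' reduction.
expected = (F.softmax(inp, dim=1) - F.one_hot(target, num_classes=5)) / inp.shape[0]
print(torch.allclose(inp.grad, expected))  # → True
```

With reduction='sum' (or reduction='none' plus a manual sum) the 1/N factor disappears and input.grad matches softmax(input) - one_hot(target) directly.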
PS: why does this site have no email notification?