The default ignore_index value of nn.CrossEntropyLoss is -100. The documentation says that ignore_index
is only applicable when the target contains class indices, but I would like to use it with class probabilities.
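import torch
import torch.nn as nn

# ignore_index defaults to -100
loss = nn.CrossEntropyLoss()
# Regular case: every target is a valid class index
input = torch.randn(3, 5, requires_grad=True)
target = torch.empty(3, dtype=torch.long).random_(5)
# Rows 0 and 2 carry the ignore_index value and should be skipped
ignore_input = torch.randn(3, 5, requires_grad=True)
ignore_target = torch.tensor([-100, 2, -100], dtype=torch.long)
r = loss(input, target)
r_ignore = loss(ignore_input, ignore_target)
r.backward()
r_ignore.backward()
print(input.grad)
print(ignore_input.grad)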
Output:
tensor([[ 0.0117, -0.3307,  0.0511,  0.2001,  0.0678],
        [ 0.0515,  0.0019,  0.2177,  0.0161, -0.2874],
        [-0.2171,  0.0212,  0.0193,  0.1339,  0.0427]])
tensor([[ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000],
        [ 0.0979,  0.1573, -0.8633,  0.5849,  0.0232],
        [ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000]])
The above code works fine: it ignores the rows whose target is -100 and does not update the weights for them (because their gradients are zero). However, the targets here represent class indices. I would like to use the same approach with class probabilities, i.e., when the input and target tensors have the same shape, but I am not sure how to do that.
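To make the behaviour I am after concrete, here is a minimal sketch that computes the soft-target cross entropy by hand and skips some rows. It assumes the rows to ignore are flagged with a separate boolean mask; the names masked_soft_cross_entropy and keep_mask are hypothetical and not part of PyTorch, and this is only one possible workaround, not something the documentation prescribes.

```python
import torch
import torch.nn.functional as F


def masked_soft_cross_entropy(logits, target_probs, keep_mask):
    """Hypothetical helper: cross entropy with probability targets,
    averaged only over the rows where keep_mask is True."""
    # Per-sample loss: -sum_c p_c * log_softmax(logits)_c
    log_probs = F.log_softmax(logits, dim=1)
    per_sample = -(target_probs * log_probs).sum(dim=1)
    keep = keep_mask.to(per_sample.dtype)
    # Ignored rows are multiplied by zero, so they get zero gradient.
    # clamp avoids a division by zero when every row is ignored.
    return (per_sample * keep).sum() / keep.sum().clamp(min=1)


torch.manual_seed(0)
logits = torch.randn(3, 5, requires_grad=True)
# Target has the same shape as the input; each row is a distribution
target_probs = torch.softmax(torch.randn(3, 5), dim=1)
# Assumption: rows 0 and 2 play the role of the "-100" rows above
keep_mask = torch.tensor([False, True, False])

loss_value = masked_soft_cross_entropy(logits, target_probs, keep_mask)
loss_value.backward()
print(logits.grad)
```

With this sketch the gradient rows for the masked samples come out as zeros, which mirrors what ignore_index does for class-index targets.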
Could you please help me out with this issue?
Thanks,
Krishnan Jothi Ramalingam