CrossEntropyLoss and OneHot classes

I’m having some trouble understanding CrossEntropyLoss as it relates to one-hot encoded classes. The docs use random numbers for the values, so to understand it better I created a set of values and targets that I expect to produce zero loss…

I have 5 classes and 5 one-hot encoded vectors (one for each class); I then provide a target index corresponding to each class.

I’m using reduction='none' to show the loss for each sample.

import torch

loss_fn = torch.nn.CrossEntropyLoss(reduction='none')

# One one-hot row per class.
x = torch.tensor([[1.0, 0.0, 0.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0, 1.0, 0.0],
                  [0.0, 0.0, 0.0, 0.0, 1.0]])

# Target class index for each row.
y = torch.tensor([0, 1, 2, 3, 4], dtype=torch.long)

loss = loss_fn(x, y)
print(loss)

results…

tensor([0.9048, 0.9048, 0.9048, 0.9048, 0.9048])

I really can’t follow the logic here. Could someone please elaborate?

CrossEntropyLoss takes unnormalized scores (sometimes called “logits”) as inputs, so it applies softmax to your x internally. Each row of x then puts probability exp(1) / (exp(1) + 4) ≈ 0.4046 on the target class, and -log(0.4046) ≈ 0.9048, which is exactly the loss you see.
So if you want the x tensor to represent the “perfect predictions”, you could take the log of what you have (send the 0s to -inf and use 0 (anything finite would do) for the 1s).
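To make that concrete, here is a minimal sketch of both points (using torch.eye and torch.nn.functional.cross_entropy simply as shorthand for the setup above): it reproduces the 0.9048 by hand and shows that feeding the log of the one-hot rows drives the loss to zero.

import torch

x = torch.eye(5)     # the same 5 one-hot rows as above
y = torch.arange(5)  # target indices 0..4

# CrossEntropyLoss = softmax followed by negative log-likelihood.
# Each row of x puts probability e / (e + 4) on its target class:
e = torch.exp(torch.tensor(1.0))
print(-torch.log(e / (e + 4)))  # ≈ 0.9048, matching the output above

# For "perfect predictions", feed log-probabilities instead:
# log(1) = 0 for the target, log(0) = -inf everywhere else.
logits = torch.log(x)
loss = torch.nn.functional.cross_entropy(logits, y, reduction='none')
print(loss)  # tensor([0., 0., 0., 0., 0.])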

Best regards

Thomas


That’s very helpful. Thank you. A bit more for me to wrap my head around, but those numbers make some sense now at least.