CrossEntropyLoss Shape Mismatch

Hello everyone, quick CrossEntropyLoss shape mismatch question for the group.

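Basic example:
import torch
import torch.nn as nn

loss = nn.CrossEntropyLoss()
input = torch.randn(3, 4, 9)  # intended as (batch=3, letters=4, classes=9)
target = torch.empty(3, 4, dtype=torch.long).random_(5)  # class indices in [0, 5)
output = loss(input, target)  # raises the error below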

ValueError: Expected target size (3, 9), got torch.Size([3, 4])

Should I reshape the input or the target here, and what would be the best way to do it?

For reference, the real toy dataset is 3 batches of 4 letters, each one-hot encoded (9 values per letter):
inputs = Variable(torch.Tensor([[h, e, l, l],
[e, l, l, o],
[l, l, o, h]]))

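According to the documentation, CrossEntropyLoss expects the target to hold class indices (not one-hot vectors) and expects the class dimension of the input to be dim 1. With an input of shape (3, 4, 9) it treats 4 as the number of classes, which is why it asks for a target of shape (3, 9). If your 9 letter classes are on the last dimension, your (3, 4) target is already the right shape, so maybe you should just move the class dimension of the input: output = loss(input.permute(0, 2, 1), target). If your targets are one-hot encoded, convert them to indices first with argmax over the class dimension.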
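Here is a minimal sketch of the usual ways to line the shapes up. It assumes the 9 letter classes sit on the last dimension of the input. The names logits, out_permute, out_flat, and one_hot are my own, not from the original post, and the random data only stands in for the real letters.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

loss = nn.CrossEntropyLoss()

# Assumed layout: (batch=3, letters=4, classes=9), as described in the question.
logits = torch.randn(3, 4, 9)
# Class indices in [0, 9), one per letter: shape (batch=3, letters=4).
target = torch.randint(0, 9, (3, 4))

# Option 1: move the class dimension to dim 1 -> (batch, classes, letters).
out_permute = loss(logits.permute(0, 2, 1), target)

# Option 2: merge batch and letters -> input (12, 9), target (12,).
out_flat = loss(logits.reshape(-1, 9), target.reshape(-1))

# If the targets are one-hot encoded, recover the indices with argmax
# over the class dimension before passing them to the loss.
one_hot = F.one_hot(target, num_classes=9)  # shape (3, 4, 9)
indices = one_hot.argmax(dim=-1)            # shape (3, 4)
```

With the default reduction='mean', both options average over the same 12 letter positions, so they should give the same scalar loss. Which one to use is mostly a matter of taste.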
