nn.CrossEntropyLoss() for text with multiple dimensions


My logits have shape: torch.Size([64, 26, 7900])
My target has shape: torch.Size([64, 26])

This is because the output comes from an LSTM for an NLP task; 7900 is the vocabulary size.

How do I formulate the loss for this scenario?

loss = nn.CrossEntropyLoss()
input = torch.randn(64, 26, 7900, requires_grad=True)
target = torch.empty(64,26, dtype=torch.long).random_(5)
output = loss(input, target)

throws the error:

ValueError: Expected target size (64, 7900), got torch.Size([64, 26])

The logits must have shape [64, 7900, 26]: nn.CrossEntropyLoss expects the class dimension at dim 1.
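A minimal sketch of that fix: permuting the logits so the vocabulary (class) dimension sits at dim 1 makes the snippet above run. The tensor shapes mirror the question; the variable names are illustrative.

```python
import torch
import torch.nn as nn

loss = nn.CrossEntropyLoss()
logits = torch.randn(64, 26, 7900, requires_grad=True)  # [batch, seq_len, vocab]
target = torch.randint(0, 7900, (64, 26))               # class indices in [0, vocab)

# Move the class dimension to dim 1: [64, 26, 7900] -> [64, 7900, 26]
output = loss(logits.permute(0, 2, 1), target)
output.backward()
```

After `backward()`, `logits.grad` has the same [64, 26, 7900] shape as the logits, so the rest of the training loop is unchanged.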


As Chetan explained, the model output tensor should have the class dimension in dim 1, with any additional dimensions afterwards.
Generally, nn.CrossEntropyLoss expects these shapes:

  • output: [batch_size, nb_classes, *]
  • target: [batch_size, *]
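An equivalent alternative, assuming the same shapes as in the question, is to flatten the batch and sequence dimensions so the loss sees a plain [N, nb_classes] / [N] pair. With the default mean reduction this gives the same value as the permute approach:

```python
import torch
import torch.nn as nn

loss = nn.CrossEntropyLoss()
logits = torch.randn(64, 26, 7900, requires_grad=True)
target = torch.randint(0, 7900, (64, 26))

# Collapse batch and sequence dims: logits -> [64*26, 7900], target -> [64*26]
output = loss(logits.reshape(-1, 7900), target.reshape(-1))
```

Some find the flattened form easier to read for sequence models, since it makes explicit that every time step is treated as an independent classification.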