nn.CrossEntropyLoss() for text with multiple dimensions

VikasRajashekar · July 27, 2020, 9:01am

Hello,

My logits are of dimension: torch.Size([64, 26, 7900])
My target is of dimension: torch.Size([64, 26])

The shapes come from an LSTM used for an NLP task: 64 is the batch size, 26 is the sequence length, and 7900 is the size of the vocabulary.

How do I formulate the loss for this scenario?

loss = nn.CrossEntropyLoss()
input = torch.randn(64, 26, 7900, requires_grad=True)
target = torch.empty(64,26, dtype=torch.long).random_(5)
output = loss(input, target)
output.backward()

throws the error:

ValueError: Expected target size (64, 7900), got torch.Size([64, 26])

chetan_patil · July 27, 2020, 11:20am

The size of the logits must be [64, 7900, 26], i.e. the class (vocabulary) dimension has to come second, before the sequence dimension.
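
One way to get that layout is to permute the last two dimensions before calling the loss. This is a minimal sketch reusing the shapes from the question; the names `criterion` and `logits` are chosen here for illustration, and the target is drawn from the full vocabulary range as an assumption.

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

# Model output as in the question: [batch_size, seq_len, vocab_size]
logits = torch.randn(64, 26, 7900, requires_grad=True)
# One class index per token: [batch_size, seq_len]
target = torch.randint(0, 7900, (64, 26))

# Move the class dimension to dim1: [64, 26, 7900] -> [64, 7900, 26]
loss = criterion(logits.permute(0, 2, 1), target)
loss.backward()
```

`permute` returns a view, so the gradient still flows back to `logits` in its original [64, 26, 7900] layout.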

ptrblck · July 28, 2020, 9:20am

As Chetan explained, the model output tensor should contain the class dimension (one logit per class) in dim1, with any additional dimensions after it.
Generally, nn.CrossEntropyLoss expects these shapes:

output: [batch_size, nb_classes, *]
target: [batch_size, *]
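
As a rough illustration of these shapes, the sketch below compares two ways of feeding a [batch_size, seq_len, nb_classes] output to the loss: permuting so the classes sit in dim1, or flattening batch and sequence into a single dimension. The variable names and the random data are assumptions made for the example. With the default `reduction='mean'`, both should give the same value up to floating point error.

```python
import torch
import torch.nn as nn

batch_size, seq_len, nb_classes = 64, 26, 7900
criterion = nn.CrossEntropyLoss()

output = torch.randn(batch_size, seq_len, nb_classes)
target = torch.randint(0, nb_classes, (batch_size, seq_len))

# Option 1: output [batch_size, nb_classes, seq_len], target [batch_size, seq_len]
loss_permuted = criterion(output.permute(0, 2, 1), target)

# Option 2: output [batch_size * seq_len, nb_classes], target [batch_size * seq_len]
loss_flat = criterion(output.reshape(-1, nb_classes), target.reshape(-1))
```

Both variants treat every token position as an independent classification over the vocabulary, so the choice between them is mostly a matter of which is more convenient for the surrounding code.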