Question about CrossEntropyLoss

sahiluppal2k · June 8, 2020, 7:44am

I have my logits of shape (16, 100, 1024), where 16 is batch size, 100 is sequence length, and 1024 is feature length.
My target is of shape (16, 100), where 16 is batch size and 100 is sequence length.

When applying nn.CrossEntropyLoss to both, I get the following error:
criterion(pred, target)
#ValueError: Expected target size (16, 1024), got torch.Size([16, 100])

whereas it works with this snippet:
criterion(pred[0], target[0])

What am I doing wrong?

ptrblck · June 8, 2020, 9:06am

I assume feature_length=1024 corresponds to the number of classes in the logit tensor?
nn.CrossEntropyLoss expects the output as [batch_size, nb_classes, seq_length] and the target as [batch_size, seq_length]. Your logits are [batch_size, seq_length, nb_classes], so dim1 (100) is treated as the class dimension, which is why the error asks for a target of size (16, 1024). If 1024 is indeed the number of classes, you would need to permute this tensor via:

logits = logits.permute(0, 2, 1)
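
For reference, here is a minimal sketch of the fix with the shapes from the question. The random tensors are just stand-ins for the real pred and target, and it assumes 1024 is the number of classes. The flattened variant at the end is not from the posts above; it is only there as a sanity check that the permuted call averages over the same 16 * 100 positions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

batch_size, seq_length, nb_classes = 16, 100, 1024

# Random stand-ins for the tensors in the question (assumption: 1024 = number of classes)
logits = torch.randn(batch_size, seq_length, nb_classes)          # [16, 100, 1024]
target = torch.randint(0, nb_classes, (batch_size, seq_length))   # [16, 100], class indices

criterion = nn.CrossEntropyLoss()

# Move the class dimension to dim1: [16, 100, 1024] -> [16, 1024, 100]
loss = criterion(logits.permute(0, 2, 1), target)

# Sanity check (alternative, not from the thread): treat every time step as its
# own sample, i.e. input [1600, 1024] and target [1600]
loss_flat = criterion(logits.reshape(-1, nb_classes), target.reshape(-1))

print(loss.item(), loss_flat.item())
```

This also shows why criterion(pred[0], target[0]) runs without an error: pred[0] has shape [100, 1024], which matches the 2D layout [N, nb_classes] with a target of shape [N].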