Hello everyone. I am using a HuggingFace model to which I pass a couple of sentences. I then take the logits and compute the loss with PyTorch's CrossEntropyLoss. The problem is as follows:
I want the loss of each sentence. If I have 3 sentences, each with 10 tokens, the logits have size [3, 10, V], where V is my vocab size. The labels have size [3, 10], i.e. the correct label for each of the 10 tokens in each sentence.
How can I get the cross entropy of each sentence, then? With reduction='mean' I get the overall mean loss (a single number). With reduction='none' I get one number per token, so basically the loss of each individual token. The code I am using is
loss_fct = nn.CrossEntropyLoss(reduction='mean')  # or 'none'
masked_lm_loss = loss_fct(outputs.logits.cpu().detach().view(-1, V), target_ids.view(-1))
Perhaps I need to define the views differently in the second line?
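To make the shapes concrete, here is a minimal runnable sketch of what I mean, with made-up sizes (3 sentences, 10 tokens, V=100) and random tensors standing in for outputs.logits and target_ids. My guess is that reshaping the per-token losses from reduction='none' back to [3, 10] and averaging over the token dimension would give one number per sentence — is that right?

```python
import torch
import torch.nn as nn

# Made-up sizes for illustration: 3 sentences, 10 tokens each, vocab of 100.
B, T, V = 3, 10, 100
logits = torch.randn(B, T, V)             # stands in for outputs.logits
target_ids = torch.randint(0, V, (B, T))  # stands in for my labels

loss_fct = nn.CrossEntropyLoss(reduction='none')
per_token = loss_fct(logits.view(-1, V), target_ids.view(-1))  # shape [B*T]

# Reshape back to [B, T] and average over the token dimension,
# giving one loss per sentence.
per_sentence = per_token.view(B, T).mean(dim=1)  # shape [B]
print(per_sentence.shape)  # torch.Size([3])
```

Since every sentence here has the same number of tokens, the mean of per_sentence matches what reduction='mean' would return over all tokens.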
Thanks in advance.