How should I reshape the output of log_softmax so that I can use it in nn.NLLLoss()?


I am trying to implement a token-level tagging model. I have two sentences with 14 and 9 words. I padded the shorter one with the utils.rnn.pad_sequence function and obtained log probabilities (with nn.LogSoftmax()) for each word:

pred_logits = self.log_softmax(feats)
# print(pred_logits.shape) => 2x14x2 (bs x max_token_num x size_of_label_set)
My target tensor, and a mask marking real (1) versus padded (0) tokens, are:
target = torch.tensor([[1,0,1,1,0,1,1,0,1,0,1,1,0,1],[1,1,1,0,1,1,1,0,1,1,1,1,1,1]])
mask = torch.tensor([[1,1,1,1,1,1,1,1,1,1,1,1,1,1],[1,1,1,1,1,1,1,1,1,0,0,0,0,0]])
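For reference, the setup above can be reproduced in a few lines (a minimal sketch: random features stand in for the real model output, and `feats` is assumed to have shape bs x max_token_num x 2):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
feats = torch.randn(2, 14, 2)        # bs x max_token_num x size_of_label_set
log_softmax = nn.LogSoftmax(dim=-1)  # normalize over the label dimension
pred_logits = log_softmax(feats)
print(pred_logits.shape)             # torch.Size([2, 14, 2])
```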

Now I want to calculate the loss by using pred_logits and target tensors as follows:

loss = self.nll_loss(pred_logits, target)
# Average/reduce the loss according to the actual number of predictions
# (i.e. one prediction per token).
loss /= mask.float().sum()
return loss

However, I get the following error:

ValueError: Expected target size (2, 2), got torch.Size([2, 14])

How could I fix this problem?
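The shape mismatch comes from nn.NLLLoss expecting the class dimension second: its input must be (N, C) or (N, C, d1, ...). Since pred_logits here is (bs, max_token_num, C), moving C into position 1 with permute makes the shapes line up (a sketch with random log-probabilities standing in for the model output):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
pred_logits = torch.log_softmax(torch.randn(2, 14, 2), dim=-1)
target = torch.tensor([[1,0,1,1,0,1,1,0,1,0,1,1,0,1],
                       [1,1,1,0,1,1,1,0,1,1,1,1,1,1]])

nll = nn.NLLLoss()
# permute to (bs, C, max_token_num) = (2, 2, 14); target stays (2, 14)
loss = nll(pred_logits.permute(0, 2, 1), target)
print(loss)  # a scalar tensor
```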


I can do it as follows without getting any error, but I wonder if there is a better way to do the same thing:

loss = 0
for i in range(target.shape[0]):
    loss_tmp = self.nll_loss(pred_logits[i, :, :], target[i, :])
    loss += loss_tmp
loss /= mask.float().sum()
return loss
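The loop above can be replaced by flattening all tokens into one batch dimension and letting nn.NLLLoss's ignore_index skip the padded positions, so the mask no longer has to be applied by hand (a sketch; note it is not numerically identical to the loop, which averages within each sentence first because nll_loss defaults to reduction='mean'):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
pred_logits = torch.log_softmax(torch.randn(2, 14, 2), dim=-1)
target = torch.tensor([[1,0,1,1,0,1,1,0,1,0,1,1,0,1],
                       [1,1,1,0,1,1,1,0,1,1,1,1,1,1]])
mask = torch.tensor([[1]*14, [1]*9 + [0]*5])

# Overwrite padded targets with -100, NLLLoss's default ignore_index,
# then flatten: input becomes (bs*max_token_num, C), target (bs*max_token_num,).
masked_target = target.masked_fill(mask == 0, -100)
nll = nn.NLLLoss(ignore_index=-100)  # mean over non-ignored tokens only
loss = nll(pred_logits.view(-1, 2), masked_target.view(-1))
print(loss)
```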