How should I reshape the output of log_softmax so that I can use it in nn.NLLLoss()?


I am trying to implement a token-level tagging model. I have two sentences with 14 and 9 words. I padded the shorter one with the utils.rnn.pad_sequence function and obtained log probabilities (with nn.LogSoftmax()) for each word:

pred_logits = self.log_softmax(feats)
# print(pred_logits.shape) => 2x14x2 (bs x max_token_num x size_of_label_set)
My target tensor, and a mask marking real (1) versus padded (0) tokens, are:
target = torch.tensor([[1,0,1,1,0,1,1,0,1,0,1,1,0,1],[1,1,1,0,1,1,1,0,1,1,1,1,1,1]])
mask = torch.tensor([[1,1,1,1,1,1,1,1,1,1,1,1,1,1],[1,1,1,1,1,1,1,1,1,0,0,0,0,0]])
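For reference, the setup above can be reproduced in a few lines (a minimal sketch: random features stand in for the real model output, and `feats` is assumed to have shape bs x max_token_num x 2):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
feats = torch.randn(2, 14, 2)        # bs x max_token_num x size_of_label_set
log_softmax = nn.LogSoftmax(dim=-1)  # normalize over the label dimension
pred_logits = log_softmax(feats)
print(pred_logits.shape)             # torch.Size([2, 14, 2])
```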

Now I want to calculate the loss by using pred_logits and target tensors as follows:

loss = self.nll_loss(pred_logits, target)
# Average/reduce the loss according to the actual number of predictions
# (i.e. one prediction per token).
loss /= mask.float().sum()
return loss

However, I get the following error:

ValueError: Expected target size (2, 2), got torch.Size([2, 14])

How could I fix this problem?
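The shape mismatch comes from nn.NLLLoss expecting the class dimension second: its input must be (N, C) or (N, C, d1, ...). Since pred_logits here is (bs, max_token_num, C), moving C into position 1 with permute makes the shapes line up (a sketch with random log-probabilities standing in for the model output):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
pred_logits = torch.log_softmax(torch.randn(2, 14, 2), dim=-1)
target = torch.tensor([[1,0,1,1,0,1,1,0,1,0,1,1,0,1],
                       [1,1,1,0,1,1,1,0,1,1,1,1,1,1]])

nll = nn.NLLLoss()
# permute to (bs, C, max_token_num) = (2, 2, 14); target stays (2, 14)
loss = nll(pred_logits.permute(0, 2, 1), target)
print(loss)  # a scalar tensor
```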


I can do it as follows without getting any error, but I wonder if there is a better way to do the same thing:

loss = 0
for i in range(target.shape[0]):
    loss_tmp = self.nll_loss(pred_logits[i, :, :], target[i, :])
    loss += loss_tmp
loss /= mask.float().sum()
return loss
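The loop above can be replaced by flattening all tokens into one batch dimension and letting nn.NLLLoss's ignore_index skip the padded positions, so the mask no longer has to be applied by hand (a sketch; note it is not numerically identical to the loop, which averages within each sentence first because nll_loss defaults to reduction='mean'):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
pred_logits = torch.log_softmax(torch.randn(2, 14, 2), dim=-1)
target = torch.tensor([[1,0,1,1,0,1,1,0,1,0,1,1,0,1],
                       [1,1,1,0,1,1,1,0,1,1,1,1,1,1]])
mask = torch.tensor([[1]*14, [1]*9 + [0]*5])

# Overwrite padded targets with -100, NLLLoss's default ignore_index,
# then flatten: input becomes (bs*max_token_num, C), target (bs*max_token_num,).
masked_target = target.masked_fill(mask == 0, -100)
nll = nn.NLLLoss(ignore_index=-100)  # mean over non-ignored tokens only
loss = nll(pred_logits.view(-1, 2), masked_target.view(-1))
print(loss)
```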