My input tensor has shape torch.Size([8, 23]):
a batch of 8 sequences with 23 words in each.
My output tensor has shape torch.Size([8, 23, 103]):
a batch of 8 sequences, with a prediction over a 103-word vocabulary for each of the 23 words.
I want to calculate a sparse cross-entropy loss for this task, but I can’t, since PyTorch seems to compute the loss for only a single element per sample. How can I code it to work? Thanks for your help.
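
For reference, here is a minimal sketch of two common ways to handle these shapes. It assumes the [8, 23, 103] tensor holds unnormalized logits (not softmax outputs) and that the targets are integer word indices of shape [8, 23]; the names `logits` and `targets` and the random data are placeholders, not the real model.

```python
import torch
import torch.nn.functional as F

batch_size, seq_len, vocab_size = 8, 23, 103

# Placeholder data standing in for the real model output and labels (assumption).
logits = torch.randn(batch_size, seq_len, vocab_size)            # [8, 23, 103]
targets = torch.randint(0, vocab_size, (batch_size, seq_len))    # [8, 23], int64 class indices

# Option 1: merge the batch and sequence dimensions, so each word is one sample.
# logits -> [184, 103], targets -> [184]
loss_flat = F.cross_entropy(logits.reshape(-1, vocab_size), targets.reshape(-1))

# Option 2: move the class dimension to position 1, which is the layout
# cross_entropy expects for extra dimensions: input [N, C, d1], target [N, d1].
loss_perm = F.cross_entropy(logits.permute(0, 2, 1), targets)

print(loss_flat.item(), loss_perm.item())
```

With the default `reduction="mean"`, both options average over all 8 * 23 word positions, so they should give the same value. If the sequences are padded, `ignore_index` can be passed to skip the padding positions.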