NLL loss and cross-entropy loss not decreasing

Apologies for the long post. I am very new to PyTorch, and I am quite confused about cross-entropy loss and NLL loss in PyTorch.
I am trying to use softmax on a sequence-to-sequence problem. I have targets of two types: the first has a vocabulary of 13 tokens and the second has 100. The model has two outputs, and I compute a loss for each type and sum them. The model output has shape [seq_length, number of tokens, embedding dim] and the target has shape [seq_length, number of tokens]. For nll_loss I apply log softmax over dimension 1 of the output, then pass the output to the loss function after transposing the last two dimensions, giving [seq_length, embedding dim, number of tokens]. For cross-entropy I pass raw logits of shape [seq_length, number of tokens]. In both cases the loss just fluctuates and does not decrease. What am I doing wrong here?
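To make the loss part concrete, here is a minimal repro of what I believe the two formulations should look like (dummy logits, assumed vocab sizes 10 and 100, seq_length 3, and made-up target indices; my real model code is longer). As I understand it, cross_entropy takes raw logits and class-index targets, while nll_loss takes log-probabilities, so the two should agree:

```python
import torch
import torch.nn.functional as F

seq_length = 3
vocab1, vocab2 = 10, 100  # assumed sizes of the two token vocabularies

# raw logits from the two heads: [seq_length, vocab_size]
logits1 = torch.randn(seq_length, vocab1)
logits2 = torch.randn(seq_length, vocab2)

# targets as class indices of shape [seq_length] (not one-hot)
target1 = torch.tensor([2, 7, 5])
target2 = torch.tensor([27, 10, 90])

# cross_entropy applies log_softmax internally, so it gets raw logits
loss_ce = F.cross_entropy(logits1, target1) + F.cross_entropy(logits2, target2)

# nll_loss expects log-probabilities: apply log_softmax over the class dim first
loss_nll = (F.nll_loss(F.log_softmax(logits1, dim=1), target1)
            + F.nll_loss(F.log_softmax(logits2, dim=1), target2))

# the two formulations compute the same quantity
assert torch.allclose(loss_ce, loss_nll)
```

With this toy setup the two losses match, so I suspect my problem is in how I build the real inputs or targets rather than in the loss call itself.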

A short description of the full process is as follows.
The inputs are a sequence of tuples like [(1,15), (2,27), (13,10)]. First I one-hot encode each token in each tuple, then embed them so that the first element of each tuple has shape [10, embed dim]. I then add the two tensors of each tuple to create one embedding per tuple, and finally stack these to create the input of shape [3, 110, embed dim]. The two output logits have shapes [3, 10, embedding dim] and [3, 100, embedding dim]. If the target is [(2,27), (12,10), (5,90)], the two elements are one-hot encoded separately, so target 1 has shape [3, 10] and target 2 has shape [3, 100].
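Roughly, the input construction looks like this (a sketch, not my exact code: embed_dim=8 and the tables W1/W2 are placeholders, the example indices are adjusted to stay inside the assumed vocab sizes, and I concatenate the two per-tuple tensors here since [10, embed dim] and [100, embed dim] cannot be added elementwise):

```python
import torch
import torch.nn.functional as F

embed_dim = 8                          # assumed embedding size
pairs = [(1, 15), (2, 27), (3, 10)]    # example input tuples (indices in range)

W1 = torch.randn(10, embed_dim)        # embedding table for the first element
W2 = torch.randn(100, embed_dim)       # embedding table for the second element

rows = []
for a, b in pairs:
    oh1 = F.one_hot(torch.tensor(a), 10).float()    # [10]
    oh2 = F.one_hot(torch.tensor(b), 100).float()   # [100]
    # scale each embedding table by the one-hot vector: only one row survives
    e1 = oh1.unsqueeze(1) * W1                      # [10, embed_dim]
    e2 = oh2.unsqueeze(1) * W2                      # [100, embed_dim]
    rows.append(torch.cat([e1, e2], dim=0))         # [110, embed_dim]

x = torch.stack(rows)                               # [3, 110, embed_dim]
assert x.shape == (3, 110, embed_dim)
```

This reproduces the [3, 110, embed dim] input shape I described above; the model then produces the two logit tensors from this input.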