Classification result converges to the background class

I’m trying to classify inputs into 41 categories (40 classes + background) using an LSTM.
The loss function is cross entropy.
To avoid bias toward the dominant class, the number of background samples per batch is limited to 1 or 2, with a batch size of 16.
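
For context, here is a minimal sketch of the kind of setup I mean (assuming PyTorch; the dimensions and names below are placeholders, not my actual values):

```python
import torch
import torch.nn as nn

NUM_CLASSES = 41    # 40 categories + background (index 0)
INPUT_SIZE = 64     # placeholder feature dimension
HIDDEN_SIZE = 128   # placeholder hidden size
SEQ_LEN = 30        # placeholder sequence length
BATCH_SIZE = 16

class LSTMClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(INPUT_SIZE, HIDDEN_SIZE, batch_first=True)
        self.fc = nn.Linear(HIDDEN_SIZE, NUM_CLASSES)

    def forward(self, x):
        # x: (batch, seq_len, input_size)
        out, _ = self.lstm(x)
        # classify from the hidden state of the last time step
        return self.fc(out[:, -1, :])

model = LSTMClassifier()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# one dummy training step with random data
x = torch.randn(BATCH_SIZE, SEQ_LEN, INPUT_SIZE)
y = torch.randint(0, NUM_CLASSES, (BATCH_SIZE,))

optimizer.zero_grad()
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
print(loss.item())
```
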
The result always looks like this:
[epoch 1]
Target label : [17 0 24 24 26 … 29]
Result label : [1 2 5 6 8 11 7 4 … 33]
Loss : 3.84312…

[epoch 20]
Target label : [17 0 24 24 26 … 29]
Result label : [0 0 16 0 21 0 0 0 … 0]
Loss : 3.72326…

[epoch 40]
Target label : [17 0 24 24 26 … 29]
Result label : [0 0 0 0 0 0 0 0 … 0]
Loss : 3.72885…

Label 0 is the background class.
Also, with these 41 categories the loss always converges to around 3.7.
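
For what it's worth, a uniform prediction over 41 classes gives exactly that loss: with all-equal logits, cross entropy is -log(1/41) = ln(41) ≈ 3.7136 regardless of the target. A quick check (again assuming PyTorch):

```python
import math
import torch
import torch.nn as nn

print(math.log(41))  # 3.7135720667...

# all-zero logits = a uniform softmax over 41 classes
logits = torch.zeros(16, 41)
targets = torch.randint(0, 41, (16,))
print(nn.CrossEntropyLoss()(logits, targets).item())  # ~3.7136
```
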
What am I doing wrong?