Cross entropy loss for sentence classification does not improve accuracy

I’m trying to use the CrossEntropyLoss criterion for sentence classification. The task is essentially a multi-class classification problem where each sentence belongs to exactly one of three classes. I’m having trouble deciding which approach I should follow when calculating the loss.

I’m also masking the padded instances. The problem is that the loss decreases up to a point, so I would expect the accuracy to improve, but in practice it does not. I’m now wondering whether there’s something wrong with my intuition (and approach) for computing and backpropagating through the loss:

self.loss_sect = torch.nn.CrossEntropyLoss(reduction='none')
# CrossEntropyLoss expects [batch, num_classes, ...], hence the permute
loss_sect = self.loss_sect(sent_sect_scores.permute(0, 2, 1), sent_sect_labels)
# zero out padded sentences, sum per example, then average over the batch
loss_sect = (loss_sect * mask.float()).sum(dim=1).mean()

Above, sent_sect_scores is a tensor of shape [batch_size, sentence_num, section_scores], where section_scores are the outputs of the neural network with a LeakyReLU activation function. sent_sect_labels has shape [batch_size, sentence_num] and contains the label associated with each sentence (i.e., an integer between 0 and 2). Is this a correct practice for computing the loss? If so, why isn’t the accuracy improving during training?
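For reference, here is a minimal self-contained sketch of that computation with made-up tensors (the batch size, sentence count, and mask values are invented for illustration; only the shapes mirror the description above):

```python
import torch

batch_size, sent_num, num_classes = 2, 4, 3

# Dummy stand-ins for sent_sect_scores and sent_sect_labels
sent_sect_scores = torch.randn(batch_size, sent_num, num_classes)
sent_sect_labels = torch.randint(0, num_classes, (batch_size, sent_num))
mask = torch.tensor([[1, 1, 1, 0],
                     [1, 1, 0, 0]])  # 1 = real sentence, 0 = padding

loss_fn = torch.nn.CrossEntropyLoss(reduction='none')

# CrossEntropyLoss expects [batch, num_classes, ...], hence the permute;
# the result is a per-sentence loss of shape [batch_size, sent_num]
loss = loss_fn(sent_sect_scores.permute(0, 2, 1), sent_sect_labels)

# Mask out padding, sum per example, average over the batch
loss = (loss * mask.float()).sum(dim=1).mean()
print(loss)
```

Note that this variant sums over sentences, so examples with more real sentences contribute a larger loss.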

PS: I also tried the following snippet, but it didn’t help improve the accuracy either:

loss_sect = self.loss_sect(sent_sect_scores.permute(0, 2, 1), sent_sect_labels)
# sum the masked per-sentence losses for each example...
loss_sect = (loss_sect * mask.float()).sum(dim=1)
# ...then divide by the number of real sentences before averaging
loss_sect = (loss_sect / torch.sum(mask, dim=1).float()).mean()
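As a sanity check on that second variant, dividing by the mask count should reproduce a plain average of the per-sentence losses over the unmasked positions of each example. A quick sketch with made-up tensors (all names and values here are hypothetical):

```python
import torch

torch.manual_seed(0)
batch_size, sent_num, num_classes = 2, 4, 3
scores = torch.randn(batch_size, sent_num, num_classes)
labels = torch.randint(0, num_classes, (batch_size, sent_num))
mask = torch.tensor([[1, 1, 1, 0],
                     [1, 1, 0, 0]])  # 1 = real sentence, 0 = padding

loss_fn = torch.nn.CrossEntropyLoss(reduction='none')
per_sent = loss_fn(scores.permute(0, 2, 1), labels)  # [batch, sent_num]

# Normalize each example's summed loss by its number of real sentences
per_example = (per_sent * mask.float()).sum(dim=1) / mask.sum(dim=1).float()
normalized = per_example.mean()

# Cross-check: average only over unmasked positions, example by example
manual = torch.stack([per_sent[i][mask[i].bool()].mean()
                      for i in range(batch_size)]).mean()
assert torch.allclose(normalized, manual)
```

Unlike the sum-only version, this keeps the loss scale independent of how many real sentences an example contains.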

And this is the log report of some training steps:

[2020-05-16 00:10:38,775 INFO] Step 50/152000; xent_sect: 9.72 (ACC: 0.4217)
[2020-05-16 00:11:06,709 INFO] Step 100/152000; xent_sect: 9.00 (ACC: 0.3430)

[2020-05-16 00:14:21,781 INFO] Step 450/152000; xent_sect: 6.10 (ACC: 0.2967)
[2020-05-16 00:14:49,663 INFO] Step 500/152000; xent_sect: 6.68 (ACC: 0.3027)
[2020-05-16 00:15:17,563 INFO] Step 550/152000; xent_sect: 6.55 (ACC: 0.3126)
[2020-05-16 00:15:45,208 INFO] Step 600/152000; xent_sect: 6.26 (ACC: 0.3179)