Testing accuracy shows no significant improvement, unlike training accuracy

Training accuracy improves well (it reaches 90-100%), but testing accuracy does not seem to improve significantly at all, even after adding NNModel.eval() before testing.
Could this be due to a bug in the code I’m not seeing or…
I’ve tried increasing the training data to as many as 100,000 samples, but the issue persists.

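# train + test model
    def run(self, epochs, train_batch_size):
        correct_predictions, total_targets = 0, 0
        for epoch in range(epochs):
            print(f'epoch: {epoch}')
            NNModel.train()  # re-enable dropout/batch-norm updates after any earlier eval() call
            batch_losses = list()
            batch_accuracies = list()
            for idx in tqdm(range(0, len(self.train_df), train_batch_size)):
                X, y = self.__data_loader(self.train_df, start=idx, stop=idx+train_batch_size)
                pred = NNModel(X.to(device))
                loss = lossFunc(pred, y.long().to(device))
                # NOTE: the update step is not shown in the post. It is assumed to go here;
                # `optimizer` is a placeholder name for whatever optimizer is actually used.
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
                batch_losses.append(loss.item())  # without this, np.mean() below returns nan
                _, pred = torch.max(pred, 1)
                correct_predictions += (pred.to('cpu').detach() == y.detach()).sum().item()
                total_targets += len(y)
                # running accuracy over the batches seen so far in this epoch
                batch_accuracy = (correct_predictions/total_targets)*100
                batch_accuracies.append(batch_accuracy)
            mean_batch_loss = np.mean(np.array(batch_losses))
            mean_batch_accuracies = np.mean(np.array(batch_accuracies))
            print(f'mean_batch_error: {mean_batch_loss} \n mean_batch_accuracy: {mean_batch_accuracies}%')
            correct_predictions, total_targets = 0, 0
            if epoch % 10 == 0:
                # (the model-saving code is not shown in the post)
                print('model saved successfully...')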
    # test model
    def __test(self, test_batch_size):
        test_losses = list()
        correct_predictions, total_targets = 0, 0
        NNModel.eval()  # disable dropout and use running batch-norm statistics
        with torch.no_grad():
            # step by test_batch_size (the original used an undefined `batch_size` here)
            for idx in tqdm(range(0, len(self.test_df), test_batch_size)):
                X, y = self.__data_loader(self.test_df, start=idx, stop=idx+test_batch_size)
                pred = NNModel(X.to(device))
                loss = lossFunc(pred, y.long().to(device))
                test_losses.append(loss.item())  # without this, the reported test error is nan
                _, pred = torch.max(pred, 1)
                correct_predictions += (pred.detach().to('cpu') == y.detach()).sum().item()
                total_targets += len(y)
                test_accuracy = (correct_predictions/total_targets)*100
            print(f'test_error:{np.mean(np.array(test_losses))} test_accuracy: {test_accuracy}%')

Your model is most likely overfitting. That is, it is essentially memorizing the input-output relations in the training data, rather than learning the “shape” of the training data. So it does very well on the training data, but does badly on the test data in comparison.
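
One way to check for this is to track the gap between training and test accuracy per epoch. The sketch below is only an illustration: the function name is hypothetical and the accuracy values are made up, not taken from the post.

```python
def generalization_gap(train_acc, test_acc):
    """Per-epoch difference between training and test accuracy (percentage points)."""
    return [tr - te for tr, te in zip(train_acc, test_acc)]

# Made-up accuracy curves, for illustration only
train_acc = [55.0, 70.0, 85.0, 95.0, 99.0]
test_acc = [52.0, 58.0, 60.0, 59.0, 58.0]

gaps = generalization_gap(train_acc, test_acc)
print(gaps)  # [3.0, 12.0, 25.0, 36.0, 41.0]
```

A gap that keeps widening while test accuracy plateaus is the usual signature of overfitting.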

You could try various regularization methods to get around overfitting. Two common ones are adding Dropout layers to your model and adding a weight decay parameter to your optimizer.
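
As a rough illustration of both suggestions, here is a minimal PyTorch sketch. `SmallNet` is a hypothetical stand-in for the actual model, and the layer sizes, dropout rate, learning rate and weight-decay value are placeholders rather than values from the original post.

```python
import torch
import torch.nn as nn

class SmallNet(nn.Module):
    """Hypothetical classifier; all sizes are placeholders."""
    def __init__(self, in_features=128, hidden=64, num_classes=4, p_drop=0.3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, hidden),
            nn.ReLU(),
            nn.Dropout(p=p_drop),  # randomly zeroes activations, in training mode only
            nn.Linear(hidden, num_classes),
        )

    def forward(self, x):
        return self.net(x)

model = SmallNet()
# weight_decay adds an L2 penalty on the parameters at every optimizer step
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)

x = torch.randn(8, 128)

model.train()                 # dropout is active
train_out = model(x)

model.eval()                  # dropout is disabled, so outputs are deterministic
with torch.no_grad():
    eval_out_a = model(x)
    eval_out_b = model(x)
```

Both knobs usually need tuning: too much dropout or weight decay can slow convergence or cause underfitting.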

Thanks for the reply.
Yes, I’ve thought of this, and I’ve tried adding dropout layers set to drop as much as 50% of the neurons, but there is still not much improvement.
The dropout layers just make training converge more slowly than usual.
I’ll check out the other regularization methods.

Try data augmentation to increase the diversity of your training data. Also, your network may be too shallow. Try deepening it (i.e., add some more layers) and see what happens.
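
What augmentation looks like depends on the data. As one possible sketch, assuming the training inputs are 1-D signals, you could add small random noise and a random circular shift. The function name, noise level and shift range below are assumptions for illustration only.

```python
import numpy as np

def augment_signal(signal, rng, noise_std=0.01, max_shift=10):
    """Return a randomly shifted, noise-perturbed copy of a 1-D signal."""
    shift = int(rng.integers(-max_shift, max_shift + 1))
    shifted = np.roll(signal, shift)                       # circular time shift
    noise = rng.normal(0.0, noise_std, size=signal.shape)  # additive Gaussian noise
    return shifted + noise

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 256, endpoint=False)
clean = np.sin(2 * np.pi * 5 * t)       # synthetic 5 Hz example signal
augmented = augment_signal(clean, rng)  # same shape, slightly different values
```

Augmentation like this should only be applied to the training set, never to the test set.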

My data is time-domain signal data that I converted to the frequency domain via the Discrete Fourier Transform.
So I don’t really think data augmentation is necessary.
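
For context, that kind of preprocessing can be sketched roughly as follows. This is not the actual pipeline from the post: the function name, the sampling rate and the test tone are assumptions, and taking the magnitude of the spectrum is just one common choice of feature.

```python
import numpy as np

def to_frequency_features(signal):
    """Magnitude spectrum of a real-valued 1-D signal, computed with the FFT."""
    spectrum = np.fft.rfft(signal)         # DFT of real input, non-negative frequencies only
    return np.abs(spectrum) / len(signal)  # divide by length so magnitudes do not grow with it

sample_rate = 256                          # Hz, placeholder value
t = np.arange(sample_rate) / sample_rate   # one second of samples
signal = np.sin(2 * np.pi * 10 * t)        # synthetic 10 Hz tone

features = to_frequency_features(signal)
freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
peak_hz = freqs[np.argmax(features)]       # frequency bin with the largest magnitude
```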

Also, I’ve tried making the network deeper, and I’ve tried making it shallower too.