Reviewing statistical results for Convolutional seq2seq model

Dear friends,

I’m a little worried about my statistical results after training my model. I’m working on an NLP task and the test set contains 50k words. The final result is:

Test Loss: 0.000 | Test accuracy: 0.998 | Test precision: 0.891 | Train recall: 0.891
Test F1: 0.891

I have checked the code and everything seems to be working, but I think this score is a little too high. I calculate the loss like this:

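        for i, batch in enumerate(iterator):

            src = batch.src
            trg = batch.trg
            # Feed the target sequence without its last token as decoder input
            output, _ = model(src, trg[:, :-1])
            # Flatten the predictions to [batch * (trg_len - 1), output_dim]
            output = output.contiguous().view(-1, output.shape[-1])
            # Drop the first target token so the targets line up with the predictions
            trg = trg[:, 1:].contiguous().view(-1)

            loss = criterion(output, trg)

            epoch_loss += loss.item()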

Could anyone kindly explain this, and how I can recheck it?

I’m not sure how you’re calculating the metrics, but if you have doubts about these high scores, I would recommend checking for data leaks.
Make sure you are not using the test data for training in any sense (e.g. early stopping, hyperparameter search, etc.).
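
One quick way to look for the most direct kind of leak is to count how many test examples also occur verbatim in the training data. The snippet below is only a rough sketch: I don't know how your data is stored, so it assumes each example is available as a `(src_tokens, trg_tokens)` pair, and the names `normalize` and `find_leaked_examples` as well as the toy data are made up for illustration.

```python
def normalize(tokens):
    """Join tokens into one lowercase string so trivial case differences don't hide duplicates."""
    return " ".join(str(t).lower() for t in tokens)


def find_leaked_examples(train_examples, test_examples):
    """Return the test examples whose (src, trg) pair also occurs in the training set."""
    # Hypothetical data layout: each example is a (src_tokens, trg_tokens) pair.
    train_keys = {(normalize(src), normalize(trg)) for src, trg in train_examples}
    return [
        (src, trg)
        for src, trg in test_examples
        if (normalize(src), normalize(trg)) in train_keys
    ]


# Toy data for illustration only; replace with your real train/test examples.
train = [(["a", "b"], ["x", "y"]), (["c", "d"], ["z"])]
test = [(["A", "b"], ["x", "y"]), (["e"], ["w"])]

leaked = find_leaked_examples(train, test)
print(f"{len(leaked)} of {len(test)} test examples also appear in the training data")
```

A large overlap would point to a problem in how the splits were created. An overlap of zero does not rule out indirect leaks, such as selecting the checkpoint or hyperparameters based on the test score.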

Thank you Mr. @ptrblck, I will check it again :slight_smile: