Reviewing statistical results for Convolutional seq2seq model

Dear friends,

I’m a little worried about my statistical results after training the model. I’m working on an NLP task and the test set contains 50k words. The final result is:

Test Loss: 0.000 | Test accuracy: 0.998 | Test precision: 0.891 | Test recall: 0.891
Test F1: 0.891

I have checked the code and everything seems to be working well, but I think these scores are a little high. I calculate the loss like this:

        for i, batch in enumerate(iterator):

            src = batch.src
            trg = batch.trg

            # feed the target shifted right: exclude the final token from the decoder input
            output, _ = model(src, trg[:, :-1])

            # flatten predictions and targets for the criterion
            output = output.contiguous().view(-1, output.shape[-1])
            trg = trg[:, 1:].contiguous().view(-1)

            loss = criterion(output, trg)

            epoch_loss += loss.item()

Could anyone kindly explain this, and suggest how to recheck it?
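One common cause of inflated scores in seq2seq evaluation is counting padding positions as correct predictions. Below is a minimal sketch of a token-level accuracy that masks out padding; `PAD_IDX` and the toy tensors are assumptions for illustration, not taken from your code — substitute your actual pad index (e.g. from your target vocab).

```python
import torch
import torch.nn.functional as F

PAD_IDX = 1  # assumption: index of the <pad> token in the target vocabulary

def token_accuracy(output, trg, pad_idx):
    """output: (N, vocab_size) logits; trg: (N,) gold token ids."""
    preds = output.argmax(dim=-1)
    mask = trg != pad_idx              # ignore padding positions
    correct = (preds == trg) & mask
    return correct.sum().item() / mask.sum().item()

# Toy demonstration: 3 real tokens followed by 5 padding tokens.
trg = torch.tensor([4, 5, 6, 1, 1, 1, 1, 1])        # 1 = <pad>
output = F.one_hot(
    torch.tensor([4, 5, 9, 1, 1, 1, 1, 1]), num_classes=10
).float()

# Unmasked accuracy also counts the 5 trivially "correct" pad predictions: 7/8
unmasked = (output.argmax(-1) == trg).float().mean().item()
# Masked accuracy counts only the 3 real tokens, 2 of which are correct: 2/3
masked = token_accuracy(output, trg, PAD_IDX)
print(unmasked, masked)
```

If your metrics are computed over the flattened `output`/`trg` without such a mask, the large share of padding can push accuracy toward 1.0 even when real-token predictions are much weaker.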

I’m not sure how you’re calculating the metrics, but if you have doubts about these high scores, I would recommend checking for data leaks.
Make sure you are not using the test data for training in any sense (e.g. early stopping, hyperparameter search, etc.).
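One quick leak check along these lines is to look for test examples that also appear in the training set. This is a hedged sketch, assuming your examples can be represented as `(src_tokens, trg_tokens)` pairs; the helper name `find_leaks` and the toy data are hypothetical:

```python
def find_leaks(train_examples, test_examples):
    """Return indices of test examples whose source also occurs in training data."""
    train_srcs = {tuple(src) for src, _ in train_examples}
    return [i for i, (src, _) in enumerate(test_examples)
            if tuple(src) in train_srcs]

# Toy demonstration: the first test pair is an exact copy of a training pair.
train = [(["a", "b"], ["x"]), (["c", "d"], ["y"])]
test = [(["c", "d"], ["y"]), (["e", "f"], ["z"])]
leaks = find_leaks(train, test)
print(leaks)  # [0] -> first test example overlaps the training set
```

A non-empty result would explain near-perfect test metrics; you would then want to deduplicate before splitting.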


Thank you Mr. @ptrblck, I will check it again :slight_smile: