Is there a gold standard for what percentage of the data should be held out as the final blind test set, with the rest used for training?

How did you guys decide this? Thanks.

There is no golden recipe, but a train/dev/test split of 80%/10%/10% is not uncommon.

Note that the test performance is itself a random variable, so the reliability of the estimate depends strongly on the number of test samples. On the other hand, if 10% is too small for a test set, chances are that 80% is too small for a training set, too.
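To make the sample-size point concrete, here is a rough sketch. It assumes the metric is plain accuracy and that the test examples are independent, so the binomial standard error applies; the function names and the 0.90 accuracy are made up for illustration.

```python
import math


def accuracy_standard_error(accuracy: float, n_test: int) -> float:
    """Binomial standard error of an accuracy measured on n_test independent examples."""
    if n_test <= 0:
        raise ValueError("n_test must be positive")
    return math.sqrt(accuracy * (1.0 - accuracy) / n_test)


def approx_95_interval(accuracy: float, n_test: int) -> tuple[float, float]:
    """Normal-approximation 95% interval, clipped to [0, 1]."""
    half_width = 1.96 * accuracy_standard_error(accuracy, n_test)
    return max(0.0, accuracy - half_width), min(1.0, accuracy + half_width)


# Illustrative only: the same observed accuracy on test sets of different sizes.
for n_test in (100, 1_000, 10_000):
    low, high = approx_95_interval(0.90, n_test)
    print(f"n_test={n_test:>6}: observed 0.90, roughly {low:.3f} to {high:.3f}")
```

The interval narrows with the square root of the test set size, so the absolute number of test examples matters more than the percentage.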
Depending on your data, you might want to stratify the splitting.
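A stratified 80/10/10 split could look roughly like the sketch below. It uses only the standard library; `stratified_split` and the toy labels are hypothetical, and in practice a library routine such as scikit-learn's `train_test_split` with its `stratify` argument does the same job.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.8, 0.1, 0.1), seed=0):
    """Return (train, dev, test) index lists with similar class proportions."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    rng = random.Random(seed)

    # Group example indices by class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, dev, test = [], [], []
    for indices in by_class.values():
        # Shuffle within each class, then cut that class into three parts.
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_dev = round(fractions[1] * len(indices))
        train.extend(indices[:n_train])
        dev.extend(indices[n_train:n_train + n_dev])
        test.extend(indices[n_train + n_dev:])  # remainder goes to test
    return train, dev, test


# Toy imbalanced data: 900 examples of class "a", 100 of class "b".
labels = ["a"] * 900 + ["b"] * 100
train, dev, test = stratified_split(labels)
print(len(train), len(dev), len(test))
```

Splitting per class keeps the rare class represented in the dev and test sets, which a purely random split does not guarantee for small or imbalanced data.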

Best regards

Thomas
