Best practice for testing a pre-trained model without data leakage?

Hi everyone,

I’m looking for the best practice for properly evaluating a fine-tuned model loaded from the HF Hub.

Specifically, I’m using the cm93/resnet18-eurosat model, which was fine-tuned on the EuroSAT dataset. My goal is to verify its performance on my own machine.

My core concern is data leakage. I know I shouldn’t test the model on data it was trained on. However, if I load the original dataset and create a test split, how can I be certain that those test samples weren’t already used by the author during the original fine-tuning? I’m worried I might accidentally get an inflated score that doesn’t reflect real-world performance.

What is the standard, industry-accepted procedure here?

Thanks in advance!

The dataset is split into training, validation, and test sets in the HF repo, and I would assume the author of the pretrained model respected those splits.

I’m not aware of a method for verifying that no data leakage happened, though.
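
What you can at least do is check which splits the dataset repo actually ships and how large they are, and compare that against whatever the model card reports. A minimal sketch, assuming the repo publishes split metadata (the dataset id below is a guess, substitute the one the model card links to):

```python
from datasets import get_dataset_split_names, load_dataset_builder

# Placeholder id -- replace with the dataset repo the model card actually points to.
dataset_id = "blanchon/EuroSAT_RGB"

# Which splits exist, e.g. ['train', 'validation', 'test']
print(get_dataset_split_names(dataset_id))

# Split sizes from the repo metadata, without downloading the images,
# so you can sanity-check them against the numbers on the model card.
builder = load_dataset_builder(dataset_id)
for name, info in builder.info.splits.items():
    print(name, info.num_examples)
```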

So would I just evaluate it on the test split, assuming the model was not trained on those samples? I apologise for the trivial question; I’m relatively new to this workflow.

Yes, you can evaluate the model on the test split (inference only, no training), and you should roughly reproduce the test metric reported on the model card. A sketch is below.
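
For what it’s worth, here is a minimal evaluation sketch. It assumes the checkpoint loads through transformers’ `AutoModelForImageClassification` and that the dataset repo exposes `image`/`label` columns; the dataset id is a placeholder, and if the checkpoint is a timm or plain PyTorch export instead, swap the loading code accordingly (check the model card for the exact setup).

```python
import torch
from datasets import load_dataset
from transformers import AutoImageProcessor, AutoModelForImageClassification

model_id = "cm93/resnet18-eurosat"
dataset_id = "blanchon/EuroSAT_RGB"   # placeholder; use the repo the model card links to

processor = AutoImageProcessor.from_pretrained(model_id)
model = AutoModelForImageClassification.from_pretrained(model_id).eval()

# Held-out split only -- no training happens here.
test_set = load_dataset(dataset_id, split="test")

# Worth verifying that the dataset's label order matches the model's head:
# compare test_set.features["label"].names with model.config.id2label.
correct = 0
for example in test_set:
    inputs = processor(images=example["image"], return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    pred = logits.argmax(-1).item()
    correct += int(pred == example["label"])

print(f"test accuracy: {correct / len(test_set):.4f}")
```

If the accuracy you get is close to the test metric on the model card, that’s a good sign the author evaluated on the same held-out split you’re using; a score far above the reported numbers would be a reason to dig into how the splits were handled.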
