Hi everyone,
I’m looking for best practices for properly testing a fine-tuned model loaded from the Hugging Face Hub.
Specifically, I’m using the cm93/resnet18-eurosat model, which was fine-tuned on the EuroSAT dataset. My goal is to verify its performance on my own machine.
My core concern is data leakage. I know I shouldn’t test the model on data it was trained on. However, if I load the original dataset and create a test split, how can I be certain that these test samples weren’t already used during the original fine-tuning process by the author? I’m worried I might accidentally get an inflated score that doesn’t reflect real-world performance.
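To make the concern concrete, here is a minimal sketch of the overlap check I would like to be able to run. It assumes I somehow had the list of files the author trained on, which I do not have. The names `fingerprint` and `find_leaked` are my own hypothetical helpers, not part of any library, and the sample bytes are placeholders rather than real EuroSAT images.

```python
import hashlib
from typing import Dict, Iterable, List, Tuple

Sample = Tuple[str, bytes]  # (file name, raw file bytes)


def fingerprint(image_bytes: bytes) -> str:
    """Return the SHA-256 hex digest of the raw file bytes."""
    return hashlib.sha256(image_bytes).hexdigest()


def find_leaked(train: Iterable[Sample], test: Iterable[Sample]) -> List[Tuple[str, str]]:
    """Return (test_name, train_name) pairs whose file contents are byte-identical."""
    # Map each training fingerprint to the first training file that produced it.
    seen: Dict[str, str] = {}
    for name, data in train:
        seen.setdefault(fingerprint(data), name)

    # Any test file whose fingerprint is already known counts as leaked.
    leaked = []
    for name, data in test:
        digest = fingerprint(data)
        if digest in seen:
            leaked.append((name, seen[digest]))
    return leaked


if __name__ == "__main__":
    # Placeholder data: ASSUMED stand-ins for the author's training files
    # and my own candidate test files.
    train_files = [("a.jpg", b"\x00\x01"), ("b.jpg", b"\x02\x03")]
    test_files = [("x.jpg", b"\x02\x03"), ("y.jpg", b"\x09\x09")]
    print(find_leaked(train_files, test_files))  # [('x.jpg', 'b.jpg')]
```

This only catches exact byte-for-byte duplicates, so a re-encoded or resized copy of a training image would slip through. More importantly, it does nothing if the original training split is not published, which is exactly the situation I am unsure about.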
What is the standard, industry-accepted procedure here?
Thanks in advance!