Which best model should I save?

lycaenidae · May 13, 2020, 4:43pm

Hi

Should I save the model that performs best on the training set, or the one that performs best on the validation set?

futscdav · May 13, 2020, 4:46pm

Provided you have a representative validation set, it’s normal practice to consider the model that performs best on the validation set to be the better one. Note that if you are comparing many models on a small validation set, you run the risk of essentially comparing noisy random variables: the differences between scores can be smaller than the sampling error of the scores themselves.
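To get a rough feel for how noisy a small validation set can be, here is a minimal sketch. It assumes the metric is accuracy and that validation examples are independent, so the binomial standard error applies. The function name `accuracy_standard_error` and the 0.90 accuracy figure are made up for illustration.

```python
import math


def accuracy_standard_error(accuracy, n_examples):
    """Binomial standard error of an accuracy measured on n_examples.

    Assumes examples are independent and each is simply right or wrong.
    """
    return math.sqrt(accuracy * (1.0 - accuracy) / n_examples)


# Hypothetical model with 90% measured accuracy, at three validation set sizes.
for n in (100, 1000, 10000):
    se = accuracy_standard_error(0.90, n)
    print(f"n={n:>5}: 0.90 +/- {se:.3f}")
```

With 100 validation examples the standard error is about 0.03, so two models that differ by one or two accuracy points are hard to tell apart on that set alone.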

lycaenidae · May 13, 2020, 5:04pm

Is it really bad to save the model that is best on the training set?

futscdav · May 13, 2020, 5:09pm

It’s not bad per se. But usually you care about performance on unseen data (which is what the validation set is supposed to represent), not so much about performance on the specific training dataset.
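As a rough illustration of why the two choices can differ, here is a framework-agnostic sketch. The `BestCheckpoint` class is a hypothetical helper, not part of any library, and the loss values are invented to mimic a run where training loss keeps falling while validation loss starts rising. The `state` dictionary is a stand-in for real model weights.

```python
import copy


class BestCheckpoint:
    """Keeps a copy of the state with the lowest monitored loss seen so far."""

    def __init__(self):
        self.best_loss = float("inf")
        self.best_epoch = None
        self.best_state = None

    def update(self, epoch, loss, state):
        """Store a copy of `state` if `loss` improves; return True if stored."""
        if loss < self.best_loss:
            self.best_loss = loss
            self.best_epoch = epoch
            # Deep copy so later training steps cannot mutate the saved state.
            self.best_state = copy.deepcopy(state)
            return True
        return False


# Invented losses: training loss falls every epoch, validation loss
# bottoms out at epoch 2 and then rises (a typical overfitting pattern).
train_losses = [0.90, 0.60, 0.40, 0.25, 0.15]
val_losses = [0.95, 0.70, 0.55, 0.60, 0.72]

by_train, by_val = BestCheckpoint(), BestCheckpoint()
for epoch, (train_loss, val_loss) in enumerate(zip(train_losses, val_losses)):
    state = {"epoch": epoch}  # stand-in for real model weights
    by_train.update(epoch, train_loss, state)
    by_val.update(epoch, val_loss, state)

print("best by training loss:  epoch", by_train.best_epoch)
print("best by validation loss: epoch", by_val.best_epoch)
```

In this made-up run, selecting by training loss keeps the last epoch (4), while selecting by validation loss keeps epoch 2, before the validation loss started to climb.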