Resuming Training with New Parameters

Hi all

Assume that we are performing hyperparameter optimization (grid or random search, for example). In the first trial, we grab a set of parameters (learning rate, batch size, etc.), train, and save the model. In the second trial, if we grab a different set of parameters, load the previously saved model, and start training, is it reasonable to expect better final performance from the second trial? This seems to be just like transfer learning.
If so, then I could save the best model in each trial and start training in the next trial by loading this best model. I tried it on a simple binary classification task and it seems to be working. So my question is: is it reasonable to start the next trial (with different parameters) on top of the previously saved ‘best’ model?

No, I don’t think so, as it would invalidate your hyperparameter search.
The goal of a grid or random search over hyperparameters is to find the best (or any suitable) set of these parameters and then use them to train the actual model.
In your use case, each new run wouldn’t give you any information about the performance of that particular hyperparameter set, since the model was already pretrained in the previous run.
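To make the bias concrete, here is a minimal stdlib-only toy (my own sketch, not PyTorch code): plain gradient descent on a quadratic stands in for real training, and the learning rate is the hyperparameter being searched. With fresh initializations the final losses rank the learning rates; with warm-starting, later trials inherit the earlier trials’ progress and look good regardless of their own learning rate.

```python
def train(lr, steps=50, w_init=0.0):
    """Toy 'training': gradient descent on the loss (w - 3) ** 2.
    The learning rate lr is the hyperparameter under search."""
    w = w_init
    for _ in range(steps):
        grad = 2 * (w - 3)
        w -= lr * grad
    return w, (w - 3) ** 2  # final weight, final loss

candidate_lrs = [0.08, 0.06, 0.04, 0.025]

# Proper search: every trial starts from the same fresh initialization,
# so each final loss reflects the quality of that learning rate alone.
fresh = {lr: train(lr)[1] for lr in candidate_lrs}

# Warm-started "search": each trial resumes from the previous trial's
# weights. Later trials inherit the earlier progress, so their losses
# say almost nothing about their own learning rate.
w, warm = 0.0, {}
for lr in candidate_lrs:
    w, loss = train(lr, w_init=w)
    warm[lr] = loss
```

With fresh inits, `lr=0.025` visibly underperforms the larger rates; warm-started, the same `lr=0.025` trial ends with near-zero loss only because the first trial had already converged, so the comparison between learning rates is meaningless.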
