Hyperparameter Sweeping by Only Measuring Convergence

One seemingly simple question from my research that I haven’t been able to fully answer revolves around hyperparameter sweeping.

Of course, there is the option of exhaustively searching the hyperparameter space (grid search); however, this quickly becomes computationally infeasible given how large that space is.

The natural next answer people propose is a package such as Optuna, which uses statistical estimators to provide some sort of “supervision” to the search. Yet even then, I typically still have to train each candidate model to convergence (early stopping based on a validation loss) and use the resulting test metric as Optuna’s optimization criterion.

But this still requires fully training the model for every hyperparameter combination. In cases where models are massive, I am wondering if anyone knows of a quicker way to essentially “assess” the performance of a given combination of hyperparameters.

For example, could I only train for 2-5 epochs on a given combination then assess the rate of convergence and use that in my optimization rather than full training and assessment on a validation loss?
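For what it’s worth, here is a minimal stdlib-only sketch of that idea, in the spirit of median pruning: each trial reports a loss every epoch, and after a few probe epochs a trial is abandoned if its intermediate loss is worse than the median of earlier trials at the same epoch. The `train_step` function is a hypothetical stand-in for one epoch of real training, and the hyperparameter is just a single learning rate:

```python
import random
import statistics

def train_step(lr, epoch):
    # Toy stand-in for one epoch of training: the loss decays over
    # epochs and is lowest when lr is near a "good" value of 0.1.
    # (Hypothetical surrogate, not a real model.)
    return (lr - 0.1) ** 2 + 1.0 / (epoch + 1)

def sweep(n_trials=20, max_epochs=10, probe_epochs=3, seed=0):
    rng = random.Random(seed)
    history = {}   # epoch -> intermediate losses of all trials at that epoch
    results = []   # (lr, last_loss, was_pruned) per trial
    for _ in range(n_trials):
        lr = rng.uniform(0.0, 0.5)
        loss, pruned = None, False
        for epoch in range(max_epochs):
            loss = train_step(lr, epoch)
            past = history.setdefault(epoch, [])
            past.append(loss)
            # Median pruning: after the probe epochs, abandon any trial
            # whose intermediate loss is worse than the median of what
            # earlier trials achieved at the same epoch.
            if epoch >= probe_epochs and len(past) > 1 and loss > statistics.median(past[:-1]):
                pruned = True
                break
        results.append((lr, loss, pruned))
    return results

results = sweep()
completed = [(lr, loss) for lr, loss, p in results if not p]
```

This is essentially what Optuna already offers via `optuna.pruners.MedianPruner`, where the training loop calls `trial.report(value, step)` each epoch and raises `optuna.TrialPruned` when `trial.should_prune()` returns true, so bad combinations are killed after only a few epochs rather than trained to convergence.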

I’m curious if anyone has any ideas!