What is the best practice for finding the best set of hyperparameters in PyTorch?

It feels like the hyperparameter space is so huge that one could easily get lost trying to tune it by hand.

Grid search over hyperparameters is an extremely long procedure.
Random search, where each set of hyperparameters is sampled at random, also seems like it would take a lot of time for, say, 20 different hyperparameters. This is especially true when the datasets are big, since a separate version of the neural network has to be trained for each hyperparameter set.
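To put a rough number on why a full grid is impractical, here is a minimal sketch that counts the configurations in a grid. The choice of 3 candidate values per hyperparameter is only an assumption for illustration, and `grid_trials` is a hypothetical helper, not part of PyTorch.

```python
def grid_trials(values_per_param: int, num_params: int) -> int:
    """Number of configurations in a full grid.

    Every hyperparameter is tried at every candidate value, so the
    count is the product of the per-parameter counts.
    """
    return values_per_param ** num_params


# Assumption for illustration: 20 hyperparameters, 3 candidate values each.
full_grid = grid_trials(3, 20)
print(full_grid)  # 3486784401 training runs

# Random search has no such fixed size: you pick the number of trials.
random_search_budget = 50
print(random_search_budget / full_grid)  # fraction of the grid actually trained
```

So even a coarse grid over 20 hyperparameters needs billions of training runs, while random search costs exactly as many runs as you decide to give it.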

Is there a systematic approach (or a generally accepted best practice) for this problem that stays time-efficient with big datasets and a large hyperparameter space?

You might want to check out Ray Tune.

@suraj.pt Thanks! I will check that out!

I have found three approaches so far:

  • Bayesian optimization
  • Random search
  • Genetic algorithm

Any more that can be added to the list?
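For reference, random search from the list above can be sketched in a few lines of plain Python. Everything here is an assumption for illustration: the three hyperparameters, their ranges, and `toy_validation_loss`, which is a hypothetical stand-in for actually training a PyTorch model and returning its validation loss.

```python
import math
import random


def sample_config(rng: random.Random) -> dict:
    """Draw one hyperparameter set (ranges are illustrative assumptions)."""
    return {
        # Learning rate sampled log-uniformly between 1e-5 and 1e-1.
        "lr": 10 ** rng.uniform(-5, -1),
        "batch_size": rng.choice([32, 64, 128, 256]),
        "dropout": rng.uniform(0.0, 0.5),
    }


def toy_validation_loss(config: dict) -> float:
    """Hypothetical stand-in for 'train the model, return validation loss'."""
    return (math.log10(config["lr"]) + 3) ** 2 + (config["dropout"] - 0.2) ** 2


def random_search(objective, n_trials: int, seed: int = 0):
    """Evaluate n_trials random configurations and keep the best one."""
    rng = random.Random(seed)
    best_config, best_loss = None, float("inf")
    for _ in range(n_trials):
        config = sample_config(rng)
        loss = objective(config)
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss


best_config, best_loss = random_search(toy_validation_loss, n_trials=50)
print(best_config, best_loss)
```

In a real setup the objective would train the network, ideally for a reduced number of epochs or on a subset of the data, so that each trial stays cheap.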

Maybe this paper can help:

Other suggestions include:

Might be useful:

Thanks for this interesting resource!

One more suggestion is to use

  • Optuna