What is the best way to perform hyperparameter search in PyTorch?

I am a bit skeptical of methods like grid and random search. They are nice to try, but I think experience is key in hyperparameter fine-tuning. These methods are not that practical when a single training run takes a week and you do not have a server with hundreds of GPUs.
For example, choosing a better optimizer that converges faster is a cheaper and more effective way to improve your training. Also, take batch size for instance: a batch size of 32 in a CNN will tend to perform better than a batch size of 4 or 8 (at least on the dataset I am working on).
Experience plays a big role, I guess. If you do want to try a small search anyway, a sketch is below.
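
If you still want to run a quick random search on a cheap proxy task, here is a minimal sketch of how it could look in PyTorch. The toy data, the tiny model, and the search ranges are placeholders I made up for illustration, not a recommendation for any real dataset:

```python
# Minimal random-search sketch (hypothetical model/data; adjust to your setup).
import random
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Toy data as a stand-in for a real dataset (assumption for illustration).
X = torch.randn(512, 20)
y = torch.randint(0, 2, (512,))
dataset = TensorDataset(X, y)

def train_once(lr, batch_size, epochs=3):
    """Train a small model once and return the final training loss."""
    model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
    for _ in range(epochs):
        for xb, yb in loader:
            optimizer.zero_grad()
            loss = criterion(model(xb), yb)
            loss.backward()
            optimizer.step()
    return loss.item()

# Sample a handful of random configurations instead of sweeping a full grid.
search_space = {"lr": [1e-4, 3e-4, 1e-3, 3e-3], "batch_size": [8, 32, 128]}
results = []
for _ in range(5):
    cfg = {k: random.choice(v) for k, v in search_space.items()}
    results.append((train_once(**cfg), cfg))

best_loss, best_cfg = min(results, key=lambda r: r[0])
print("best config:", best_cfg, "loss:", best_loss)
```

The idea is just to run a few short, cheap trials to narrow down the range before committing to a week-long run, which is about as far as search methods go when compute is limited.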
