Hyperparameter Optimisation - Multiple Models Multiple Cores

magnus_w · December 14, 2017, 7:44pm

Dear fellows,

I would like to know the best practice for training multiple models on multiple CPU cores.
Basically, my task is a hyperparameter and initial-value search for my (very, very) small models.

Does anybody have advice?

Br,
Magnus

richard · December 14, 2017, 8:16pm

You might want to look at torch.multiprocessing
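
As a rough illustration of that suggestion, here is a minimal sketch of a grid search over learning rates with a process pool. It uses the standard library's multiprocessing so that it runs without PyTorch installed; torch.multiprocessing is documented as a wrapper with the same API, so swapping the import should work. The toy one-weight model, the train and grid_search names, and the learning-rate values are assumptions made up for this example, not something from the thread.

```python
import multiprocessing as mp  # torch.multiprocessing exposes the same API


def train(lr, epochs=200):
    """Fit y = w * x to toy data with gradient descent; return (lr, final loss)."""
    xs = [0.0, 1.0, 2.0, 3.0]
    ys = [0.0, 2.0, 4.0, 6.0]  # toy data generated with w = 2
    w = 0.0  # a fresh "model" is built inside every call
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    loss = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
    return lr, loss


def grid_search(learning_rates, processes=None):
    """Train one model per learning rate, spread over a pool of worker processes."""
    with mp.Pool(processes=processes) as pool:
        return pool.map(train, learning_rates)


if __name__ == "__main__":
    results = grid_search([0.001, 0.01, 0.1])
    best_lr, best_loss = min(results, key=lambda r: r[1])
    print(f"best lr={best_lr}, loss={best_loss:.6f}")
```

Each worker builds its own model from the hyperparameters it receives, so nothing has to be shared between processes. For very small models, that rebuild cost is usually negligible next to the training itself.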

magnus_w · December 20, 2017, 10:55pm

Hey Richard,

Thanks very much. I had a look at it. However, I couldn’t figure out whether there is a convenient way to avoid building a new model/graph whenever I want to train another hyperparameter setting in parallel.

Regards,
Magnus

magnus_w · January 1, 2018, 11:03am

CPU parallelization: http://pytorch.org/tutorials/intermediate/dist_tuto.html

jpeg729 · January 1, 2018, 12:40pm

I saw that one too, but it didn’t seem to fit my use case either. My approach would be something like this…

from joblib import Parallel, delayed

# train_function and args are placeholders for your own training routine and its arguments
results = Parallel(n_jobs=-1)([delayed(train_function)(args) for _ in range(80)])

Here train_function trains a model for a fixed number of epochs, or until some stopping criterion is met, and returns (for example) a list of validation losses per epoch. When the parallel jobs are all done, results is simply a list containing the return value from each run of train_function.

I haven’t tested this approach, so I can’t say whether torch Tensors and Variables can be passed to train_function successfully, nor whether they can share memory properly. That said, sklearn uses joblib, so I am pretty sure that numpy arrays can be passed to train_function efficiently.
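
To make the joblib idea a bit more concrete, here is a minimal sketch in which every job gets its own learning rate and seed (the seed sets the initial value) and returns its per-epoch validation losses. The toy one-weight model, the run_search helper, and all parameter names are hypothetical and only there for illustration; a real train_function would build and train a PyTorch model instead. Only plain Python numbers are passed between processes here, so the open question about passing Tensors does not come up.

```python
import random

TRUE_W = 2.0  # toy ground truth: y = 2 * x


def train_function(lr, seed, epochs=20):
    """Train a one-weight model; return the validation loss after each epoch."""
    rng = random.Random(seed)  # the seed controls the initial value
    w = rng.uniform(-1.0, 1.0)
    train_x = [1.0, 2.0, 3.0]
    val_x = [4.0, 5.0]
    val_losses = []
    for _ in range(epochs):
        grad = sum(2 * (w * x - TRUE_W * x) * x for x in train_x) / len(train_x)
        w -= lr * grad
        val_loss = sum((w * x - TRUE_W * x) ** 2 for x in val_x) / len(val_x)
        val_losses.append(val_loss)
    return val_losses


def run_search(lrs, seeds, n_jobs=-1):
    """Run every (lr, seed) combination in parallel; return {(lr, seed): losses}."""
    # Imported here so that train_function itself does not depend on joblib.
    from joblib import Parallel, delayed

    settings = [(lr, seed) for lr in lrs for seed in seeds]
    results = Parallel(n_jobs=n_jobs)(
        delayed(train_function)(lr, seed) for lr, seed in settings
    )
    return dict(zip(settings, results))


# Example call, e.g. under `if __name__ == "__main__":` in a script:
#     results = run_search(lrs=[0.01, 0.1], seeds=[0, 1, 2])
#     best = min(results, key=lambda k: results[k][-1])
```

Keying the results by (lr, seed) keeps each loss curve attached to the setting that produced it, so the best run can be picked by its final validation loss.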