In my research (RL), the models are often quite small and only utilize 10-20% of the GPU's compute. When doing a hyperparameter search, where the network has to be trained under several different configurations, is it okay to open several multiprocessing.Process instances and train them in parallel on a single GPU, given that GPU memory is sufficient (the network is quite small)?
Note that this is different from Hogwild! training, where each process shares the same model and updates its parameters asynchronously; here each process would train its own independent copy.
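To make the setup concrete, here is a minimal sketch of what I have in mind. The train function is a hypothetical placeholder (a real one would build the model, move it to the GPU, and run the RL loop); the config dicts and the run_search helper are names I made up for illustration. The "spawn" start method is used because child processes that initialize CUDA should not be forked from a parent that may already have touched CUDA.

```python
import multiprocessing as mp

def train(config, results):
    # Hypothetical stand-in for one training run. A real version would build
    # the model, call model.to("cuda"), and run the RL training loop; here it
    # just reports its config so the sketch runs without a GPU.
    results.put((config["lr"], "finished"))

def run_search(configs):
    # One process per hyperparameter configuration, all sharing one GPU.
    # "spawn" is the safe start method when children will initialize CUDA.
    ctx = mp.get_context("spawn")
    results = ctx.Queue()
    procs = [ctx.Process(target=train, args=(cfg, results)) for cfg in configs]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return [results.get() for _ in procs]

if __name__ == "__main__":
    configs = [{"lr": lr} for lr in (1e-2, 1e-3, 1e-4)]
    print(run_search(configs))
```

Each process holds its own model and its own CUDA context, so there is no parameter sharing between them, only contention for GPU memory and compute.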