Hi, I know this is 2.5 years old, but did you ever find a solution? I believe I am doing exactly the same thing as you: evolving a CNN and training each model in its own process using the multiprocessing package. CUDA initializes separately in each process, which drastically reduces performance. Any tips that you can remember?