There are 4 submodels x1, x2, x3 and x4 in my training, which are trained separately. In my script, I need to call F(x1,y), F(x2,y), F(x3,y) and F(x4,y) (F is the same function for x1, x2, x3 and x4, and y is a common variable) one by one every time to do some computation and update their weights. But the speed is slow. I wonder if there is a way to call F(x1,y), F(x2,y), F(x3,y) and F(x4,y) in parallel, e.g. with multiprocessing, for a speed-up, since they all use the same function F.
If every operation they perform is big enough, it will already be parallelized at a low level, so you won't gain much by forcing the calls to run in parallel.
Use this code as a reference for parallelism on the CPU:
```python
from multiprocessing import Process, Manager

# Common function
def my_func(x, output_dict):
    out = x ** 2
    output_dict[x] = out

if __name__ == "__main__":
    inputs = [1, 2, 3, 4]

    # Create a manager for the shared variable (output_dict)
    manager = Manager()
    output_dict = manager.dict()

    # Instantiate and start one process per input
    procs = []
    for i in inputs:
        proc = Process(target=my_func, args=(i, output_dict))
        procs.append(proc)
        proc.start()

    # Wait for the processes to complete
    for proc in procs:
        proc.join()

    # Processes may finish in any order, so sort for a stable result
    print(sorted(output_dict.items()))  # [(1, 1), (2, 4), (3, 9), (4, 16)]
```
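A more compact alternative (a sketch, not part of the original answer) uses `concurrent.futures.ProcessPoolExecutor`, which handles process creation, joining, and result collection for you. Here `my_func` and `run_parallel` are hypothetical stand-ins for your real per-submodel computation F(x, y); note that sharing actual GPU models across processes needs extra care and may not speed things up for the reason given above.

```python
from concurrent.futures import ProcessPoolExecutor

def my_func(x):
    # Stand-in for the real per-submodel computation F(x, y)
    return x ** 2

def run_parallel(inputs):
    # map() distributes the calls over worker processes and
    # returns the results in the same order as the inputs
    with ProcessPoolExecutor() as pool:
        return list(pool.map(my_func, inputs))

if __name__ == "__main__":
    print(run_parallel([1, 2, 3, 4]))  # [1, 4, 9, 16]
```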