Run different models on different GPUs in parallel


My question is not about training. I am writing demo code with two models I have already trained.
To speed up the code, I want to load the two models on two different GPUs, send different inputs to each model at the same time, and concatenate the results into the output.

I have tried to use torch.multiprocessing, but I got RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method.

Then I added mp.set_start_method('spawn'), but I still got RuntimeError: context has already been set

My code structure is like:

def function(queue, img, net, gpu_id):
    x = torch.from_numpy(img).permute(2, 0, 1).unsqueeze(0).cuda(gpu_id)
    y = net(x)
    queue.put(y.cpu().numpy())  # move off the GPU before sending back

if __name__ == '__main__':
    net1 = net1.cuda(0)
    net2 = net2.cuda(1)
    queue = mp.Queue()
    process1 = mp.Process(target=function, args=(queue, img1, net1, 0))
    process2 = mp.Process(target=function, args=(queue, img2, net2, 1))
    process1.start()
    process2.start()
    output1 = queue.get()
    output2 = queue.get()
    process1.join()
    process2.join()
    output = np.concatenate((output1, output2), 0)

How can I implement multiprocessing on multi GPUs correctly?
Thanks a lot!!


Hi. The spawn context needs to be set at the very beginning of the file you are executing, and you also need to run the file from a terminal. In my experience, setting a spawn context in a Jupyter notebook won't work. So your file needs to begin with something like this:

import torch.multiprocessing as mp

if __name__ == '__main__':
    mp.set_start_method('spawn')
    # do other imports, etc.
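Putting the pieces together, here is a minimal, CPU-only sketch of the overall pattern using the standard library's multiprocessing, which torch.multiprocessing extends. The run_model function is a stand-in for the real forward pass (e.g. net(x.cuda(gpu_id))), and the tag lets the parent tell the two results apart since queue ordering is not guaranteed:

```python
import multiprocessing as mp

def run_model(data):
    # Stand-in for the real forward pass on one GPU,
    # e.g. net(x.cuda(gpu_id)).cpu().numpy()
    return [x * 2 for x in data]

def worker(queue, tag, data):
    # Tag the result so the parent knows which process produced it
    queue.put((tag, run_model(data)))

if __name__ == '__main__':
    ctx = mp.get_context('spawn')  # explicit spawn context, required for CUDA
    queue = ctx.Queue()
    p1 = ctx.Process(target=worker, args=(queue, 0, [1, 2]))
    p2 = ctx.Process(target=worker, args=(queue, 1, [3, 4]))
    p1.start()
    p2.start()
    # Drain the queue before joining, then reassemble in a fixed order
    results = dict(queue.get() for _ in range(2))
    p1.join()
    p2.join()
    output = results[0] + results[1]  # stand-in for np.concatenate
    print(output)                     # [2, 4, 6, 8]
```

Using mp.get_context('spawn') instead of the global mp.set_start_method('spawn') avoids the "context has already been set" error, because it returns a context object rather than mutating global state.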