Torch.multiprocessing for model inference

I am trying to use torch.multiprocessing for inference.

I followed the tutorial from the PyTorch documentation: https://pytorch.org/docs/master/notes/multiprocessing.html#sharing-cuda-tensors
My code is below:

import time

import torch
import torch.multiprocessing as mp

# MyModel, path_to_model, device and imgs are defined elsewhere in my script
model = MyModel(num_classes=2)
model.load_state_dict(torch.load(path_to_model))
model = model.to(device)
model.eval()

def inference(model):
    torch.set_num_threads(1)
    results = []
    for img in imgs * 100:
        result = model(img)
        print(result[1].data)
        results.append(result[1].data.cpu())
    return results

processes = []

if __name__ == '__main__':
    mp.set_start_method('spawn', force=True)
    model.share_memory()
    
    st = time.time()
    for i in range(4):
        p = mp.Process(target=inference, args=(model,))
        p.start()
        processes.append(p)
    
    for p in processes: p.join()
    print(time.time() - st)

It launches, but it finishes far too quickly, as if the model never even received the images to run inference on.

So I'm trying to get the results back from the inference function when using multiprocessing.
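To be concrete, something like the rough sketch below is what I have in mind (same MyModel, path_to_model, device and imgs as above): each worker pushes its results back to the parent through an mp.Queue, and the model is built under the __main__ guard so that the spawn start method does not re-run the setup in every child. I'm not sure this is the right pattern, though:

import time

import torch
import torch.multiprocessing as mp

def inference(model, imgs, queue):
    # Run the model on every image and send the detached CPU tensors
    # back to the parent through the queue.
    torch.set_num_threads(1)
    results = []
    with torch.no_grad():
        for img in imgs:
            result = model(img)
            results.append(result[1].cpu())  # same indexing as in my code above
    queue.put(results)

if __name__ == '__main__':
    mp.set_start_method('spawn', force=True)

    model = MyModel(num_classes=2)  # MyModel / path_to_model / device / imgs as above
    model.load_state_dict(torch.load(path_to_model))
    model = model.to(device)
    model.eval()
    model.share_memory()

    queue = mp.Queue()
    processes = []
    st = time.time()
    for _ in range(4):
        p = mp.Process(target=inference, args=(model, imgs * 100, queue))
        p.start()
        processes.append(p)

    # Drain the queue before joining so the workers are never blocked
    # on a full queue while the parent is waiting in join().
    all_results = [queue.get() for _ in processes]

    for p in processes:
        p.join()
    print(time.time() - st)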

What am I doing wrong?


Have you found a solution yet? Is it actually faster to use multiprocessing for inference?

I'm confused about this too; the topic below may help:
Multiprocessing CUDA memory

Nope, but I decided to move forward with multiple instances of microservices instead: MPI + gunicorn.
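Roughly, the gunicorn side looks like the sketch below: each worker process loads its own copy of the model and serves a predict endpoint, and gunicorn runs several such workers in parallel. Flask, the module names, and the preprocess() helper are only placeholders for illustration, not code from this thread:

# app.py -- rough sketch of one service instance
import torch
from flask import Flask, request, jsonify

from mymodel import MyModel, preprocess  # hypothetical module / helper names

app = Flask(__name__)

# Each gunicorn worker process loads its own copy of the model at import time.
model = MyModel(num_classes=2)
model.load_state_dict(torch.load("model.pth", map_location="cpu"))
model.eval()
torch.set_num_threads(1)  # keep every worker to a single intra-op thread

@app.route("/predict", methods=["POST"])
def predict():
    with torch.no_grad():
        x = preprocess(request.data)  # raw request bytes -> input tensor
        out = model(x)
    return jsonify(out.tolist())

You then start several workers with something like gunicorn -w 4 -b 0.0.0.0:8000 app:app and scale out by adding more instances.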

Hey @assyl_coop, can you explain this MPI + gunicorn solution in more detail? I need to solve the same problem. I am also trying to use multiprocessing for inference on CPU.