I am trying to run inference with multiple models in parallel on CUDA. My use case is to run all of them on the same input image. I get the following error:
RuntimeError: CUDA error: initialization error
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Even with CUDA_LAUNCH_BLOCKING=1, I still get RuntimeError: CUDA error: initialization error
A minimal working example (MWE) that reproduces the error:
import torch
from torch import nn
import torch.multiprocessing as mp
from torchvision import models

if __name__ == '__main__':
    num_processes = 2
    input_image_pytorch = torch.randn((1, 3, 256, 256), requires_grad=False, device="cuda")
    processes = []
    for rank in range(num_processes):
        model = models.resnet18().eval().to("cuda")
        p = mp.Process(target=model, args=(input_image_pytorch,))
        p.start()
        processes.append(p)
    for p in processes:
        p.join()
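
One likely cause, not confirmed for this exact setup: the parent process initializes CUDA before starting the children, and on Linux the default start method is fork, which cannot reuse an already initialized CUDA context. Below is a minimal sketch of a variant that uses the "spawn" start method and creates each model inside its child process. The names build_model, run_inference and main are hypothetical, the tiny network is only a stand-in for models.resnet18(), and the sketch falls back to the CPU when no GPU is available.

```python
import torch
from torch import nn
import torch.multiprocessing as mp


def build_model():
    # Hypothetical stand-in for models.resnet18(); swap in the real model as needed.
    return nn.Sequential(
        nn.Conv2d(3, 8, kernel_size=3),
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
        nn.Linear(8, 10),
    )


def run_inference(rank, image, device, results):
    # Runs in the child process: the model is built and moved to the device here,
    # so any CUDA initialization happens in the child, not in the parent.
    model = build_model().eval().to(device)
    with torch.no_grad():
        output = model(image.to(device))
    # Report only the output shape to keep the returned payload small.
    results.put((rank, tuple(output.shape)))


def main(num_processes=2):
    device = "cuda" if torch.cuda.is_available() else "cpu"
    ctx = mp.get_context("spawn")  # start fresh interpreters instead of forking
    results = ctx.Queue()
    image = torch.randn(1, 3, 256, 256)  # same input for every model, kept on the CPU
    processes = []
    for rank in range(num_processes):
        p = ctx.Process(target=run_inference, args=(rank, image, device, results))
        p.start()
        processes.append(p)
    # Collect one result per child before joining.
    shapes = dict(results.get() for _ in processes)
    for p in processes:
        p.join()
    return shapes


if __name__ == "__main__":
    print(main())
```

With "spawn", the target function has to be defined at module level so the child can import it, which is why the work is wrapped in run_inference instead of passing the model object as the target. Whether separate processes actually run faster than sequential inference on a single GPU is a separate question that this sketch does not address.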