I’m having trouble running multiple processes on the same GPU, so I wrote a minimal error-reproducing example.
The example runs successfully on my local machine with CUDA 10.2 and PyTorch 1.2.0.
While it works just fine there, it fails on a cluster with CUDA 10.1 and PyTorch 1.2.0.
Does anybody know why or how to overcome this? Thanks a ton.
import torch.multiprocessing as _mp
import torch
import os
import time
import numpy as np

mp = _mp.get_context('spawn')


class Process(mp.Process):
    def __init__(self, id):
        super().__init__()
        print("Init Process")
        self.id = id

    def run(self):
        os.environ['CUDA_VISIBLE_DEVICES'] = '0'
        for i in range(3):
            with torch.cuda.device(0):
                x = torch.Tensor(10).to(0)
                x.to('cpu')
                del x
            time.sleep(np.random.random())


if __name__ == "__main__":
    num_processes = 2
    os.environ['CUDA_VISIBLE_DEVICES'] = '0'
    processes = [Process(i) for i in range(num_processes)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
Process Process-2:
Traceback (most recent call last):
  File "/cluster/home/marksm/software/anaconda/envs/test/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/cluster/home/marksm/mp_demonstration.py", line 20, in run
    x = torch.Tensor(10).to(0)
RuntimeError: CUDA error: all CUDA-capable devices are busy or unavailable
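Could the GPU's compute mode on the cluster be the culprit? That is only a guess on my part, but this error is what I would expect if the device were set to Exclusive_Process (one CUDA context per GPU) or Prohibited rather than Default. A way to check from a shell on the cluster node:

```shell
# Print the compute mode of every visible GPU.
# "Default" allows multiple processes per GPU;
# "Exclusive_Process" allows only one CUDA context per GPU;
# "Prohibited" allows none.
nvidia-smi --query-gpu=index,compute_mode --format=csv
```

If the mode is not Default, only an administrator can change it (via `nvidia-smi -c`), so the processes would instead have to share a single CUDA context or take turns on the device.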