The multiprocessing best practices in the documentation state:
“The CUDA runtime does not support the fork start method; either the spawn or forkserver start method are required to use CUDA in subprocesses”
Does this mean that I can’t write a DDP training script that runs on GPUs with the ‘fork’ start method?
I haven’t found a clear answer to this, and I’m not sure what “CUDA runtime” means in the docs. In my specific use case, I more or less have to use ‘fork’ so that I can pass objects such as data to the subprocesses through shared memory.
If so, what are the limitations of using mp.Process with the fork start method?
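
To make the question concrete, here is a minimal sketch of what I understand the non-fork alternative to look like. It uses only the standard-library `multiprocessing` module (which `torch.multiprocessing` wraps) and no CUDA calls. The names `pick_start_method`, `worker`, and `launch` are hypothetical helpers I made up for illustration, and the one-byte-per-rank shared buffer is only a stand-in for my real data.

```python
import multiprocessing as mp
from multiprocessing import shared_memory


def pick_start_method(needs_cuda: bool) -> str:
    """Pick a start method; 'fork' is only chosen when CUDA is not needed."""
    if not needs_cuda and "fork" in mp.get_all_start_methods():
        return "fork"
    # Per the quoted docs, CUDA in subprocesses needs spawn or forkserver.
    return "spawn"


def worker(rank: int, shm_name: str) -> None:
    # Attach to the shared block by name instead of inheriting it through fork.
    shm = shared_memory.SharedMemory(name=shm_name)
    shm.buf[rank] = rank + 1  # each rank writes one byte into its own slot
    shm.close()


def launch(world_size: int, needs_cuda: bool = True) -> list:
    """Start one process per rank and return what they wrote to shared memory."""
    ctx = mp.get_context(pick_start_method(needs_cuda))
    shm = shared_memory.SharedMemory(create=True, size=world_size)
    try:
        procs = [
            ctx.Process(target=worker, args=(rank, shm.name))
            for rank in range(world_size)
        ]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        return list(bytes(shm.buf[:world_size]))
    finally:
        shm.close()
        shm.unlink()
```

With `spawn`, `launch(...)` would have to be called from under an `if __name__ == "__main__":` guard in a script, since the child processes re-import the main module. What I can’t tell is whether passing the data by name like this is the only option, or whether ‘fork’ is still usable as long as the parent never touches CUDA before forking.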