dfalbel
(Daniel Falbel)
October 10, 2022, 7:40pm
I know that forking is not supported when using CUDA, e.g.:
But there are some constrained scenarios where forking is possible, e.g.:
opened 02:37PM - 13 Dec 21 UTC
closed 01:53AM - 12 Feb 22 UTC
module: multiprocessing
module: autograd
triaged
### 🚀 The feature, motivation and pitch
I'm aware that [autograd cannot be used in combination with forking, because it uses thread pools](https://github.com/pytorch/pytorch/wiki/Autograd-and-Fork).
However, it would be nice if autograd could be used both before and after forking, as long as they don't share any information.
I've written a minimal example:
```
import multiprocessing as mp
import torch


def train():
    print("Training started")
    x = torch.Tensor(0)
    x.requires_grad = True
    x.sum().backward()
    print("Training ended")


if __name__ == "__main__":
    print("Entered main")
    train()
    worker = mp.Process(target=train)
    worker.start()
```
Expected output:
```
Entered main
Training started
Training ended
Training started
Training ended
```
Real output:
```
Entered main
Training started
Training ended
Training started
Process Process-1:
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "mwe.py", line 9, in train
    x.sum().backward()
  File "/home/christopher/.local/share/virtualenvs/src-i_X_I5Sj/lib/python3.8/site-packages/torch/tensor.py", line 198, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/christopher/.local/share/virtualenvs/src-i_X_I5Sj/lib/python3.8/site-packages/torch/autograd/__init__.py", line 98, in backward
    Variable._execution_engine.run_backward(
RuntimeError: Unable to handle autograd's threading in combination with fork-based multiprocessing. See https://github.com/pytorch/pytorch/wiki/Autograd-and-Fork
```
The above example did work in PyTorch 1.4 but stopped working in 1.5.
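For comparison, here is a minimal sketch of the same example using the `spawn` start method instead of the default fork. It assumes that a freshly spawned interpreter does not inherit the parent's autograd thread state; the helper name `run_with_spawn` and the use of `torch.zeros` are illustrative choices, not something taken from the issue.

```python
import multiprocessing as mp


def train():
    # Imported inside the function so that every process loads torch itself.
    import torch

    print("Training started")
    x = torch.zeros(3, requires_grad=True)
    x.sum().backward()
    print("Training ended")


def run_with_spawn(target):
    """Run `target` in a child started with "spawn" and return its exit code.

    `run_with_spawn` is a hypothetical helper written for this sketch.
    """
    ctx = mp.get_context("spawn")  # fresh interpreter, nothing inherited via fork
    worker = ctx.Process(target=target)
    worker.start()
    worker.join()
    return worker.exitcode
```

With `spawn`, the calls `train()` and `run_with_spawn(train)` need to sit under an `if __name__ == "__main__":` guard in a script, because the child re-imports the main module.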
### Alternatives
If this feature won't be added, I recommend extending the documentation to state explicitly that autograd will fail even if no information is shared across threads.
### Additional context
This unsupported behavior seems to cause confusion; it has prompted at least two questions on Stack Overflow (the second one by me):
- https://stackoverflow.com/questions/63081486/pytorch-multiprocessing-error-with-hogwild
- https://stackoverflow.com/questions/70308084/using-pytorch-in-multiple-independent-forked-threads
The error message is raised in https://github.com/pytorch/pytorch/blob/065018d8120567976b5f9296e56c3ece5db51e12/torch/csrc/autograd/engine.cpp#L1047-L1053
cc @VitalyFedyunin @ezyang @albanD @zou3519 @gqchen @pearu @nikitaved @soulitzer @Lezcano @Varal7
Are there any recommendations for using fork with MPS-enabled builds of PyTorch, or when using MPS tensors?
FWIW, I have tried forking in three simple scenarios:
Creating an MPS tensor:
```
def mps_tensor():
    torch.randn(100, 100, device="mps")
```
Creating a tensor on the CPU:
```
def cpu_tensor():
    torch.randn(100, 100, device="cpu")
```
On the CPU, but calling backward:
```
def cpu_but_backward():
    x = torch.randn(100, 100)
    y = torch.randn(100, 100, requires_grad=True)
    loss = torch.mm(x, y).sum()
    loss.backward()
```
I then tried forking the process and running those functions with:
```
import multiprocessing
from multiprocessing import Process

multiprocessing.set_start_method("fork")
p = Process(target=mps_tensor)
p.start()
```
The first and third functions (`mps_tensor` and `cpu_but_backward`) failed with:
The process has forked and you cannot use this CoreFoundation functionality safely. You MUST exec().
Break on __THE_PROCESS_HAS_FORKED_AND_YOU_CANNOT_USE_THIS_COREFOUNDATION_FUNCTIONALITY___YOU_MUST_EXEC__() to debug.
objc[60703]: +[NSPlaceholderMutableString initialize] may have been in progress in another thread when fork() was called.
objc[60703]: +[NSPlaceholderMutableString initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
while the second one (`cpu_tensor`) works normally.
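To make this kind of comparison repeatable, the scenarios can be run one by one in child processes and their exit codes collected. The sketch below is only an illustration: `describe_exit` and `run_scenarios` are hypothetical helper names, and it assumes a crashing child shows up as a non-zero `exitcode` on the `Process` object (`multiprocessing` reports termination by a signal as a negative number).

```python
import multiprocessing as mp


def describe_exit(code):
    """Turn a multiprocessing exit code into a short human-readable label."""
    if code == 0:
        return "ok"
    if code is not None and code < 0:
        # multiprocessing reports death by signal as the negated signal number
        return f"killed by signal {-code}"
    return f"exited with status {code}"


def run_scenarios(scenarios, start_method="fork"):
    """Run each callable in its own child process and label how it ended.

    `scenarios` maps a display name to a zero-argument callable.
    """
    ctx = mp.get_context(start_method)
    results = {}
    for name, target in scenarios.items():
        child = ctx.Process(target=target)
        child.start()
        child.join()
        results[name] = describe_exit(child.exitcode)
    return results
```

Calling `run_scenarios({"mps_tensor": mps_tensor, "cpu_tensor": cpu_tensor, "cpu_but_backward": cpu_but_backward}, "fork")` would be one way to tabulate the three cases above. Passing `"spawn"` instead allows a side-by-side comparison, provided the call sits under an `if __name__ == "__main__":` guard.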