How to limit torch.compile to CPU only?

Hi,

I am using torch.compile to optimize some preprocessing components that run on the CPU, e.g.:

    depth_to_3d_opt = torch.compile(kornia.geometry.depth_to_3d, mode="reduce-overhead")

Then I try to run multi-GPU training via transformers + accelerate.
However, torch.compile fails with a CUDA-related error, torch._dynamo.exc.InternalTorchDynamoError: Cannot re-initialize CUDA in forked subprocess:

[rank3]:   File "/opt/conda/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 172, in _fn
[rank3]:     cuda_rng_state = torch.cuda.get_rng_state()
[rank3]:   File "/opt/conda/lib/python3.10/site-packages/torch/cuda/random.py", line 31, in get_rng_state
[rank3]:     _lazy_init()
[rank3]:   File "/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py", line 300, in _lazy_init
[rank3]:     raise RuntimeError(
[rank3]: torch._dynamo.exc.InternalTorchDynamoError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
[rank3]: You can suppress this exception and fall back to eager by setting:
[rank3]:     import torch._dynamo
[rank3]:     torch._dynamo.config.suppress_errors = True

Is there a way to force torch.compile to run in CPU-only mode?
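One workaround (a sketch, not an official API): Dynamo only tries to snapshot the CUDA RNG state when torch.cuda.is_available() is true, so if you hide the GPUs from the forked worker processes before anything touches torch.cuda, the compiled CPU preprocessing should run without hitting the re-initialization error. The worker_init_fn below is a hypothetical helper you would pass to the DataLoader that runs the compiled preprocessing; alternatively, DataLoader's multiprocessing_context="spawn" avoids the fork entirely, at the cost of slower worker startup.

```python
import os

def hide_cuda_in_worker(worker_id: int) -> None:
    # Hypothetical worker_init_fn: make CUDA invisible to this forked
    # DataLoader worker *before* any torch.cuda call initializes it,
    # so torch.cuda.is_available() returns False and Dynamo skips
    # reading the CUDA RNG state during frame conversion.
    os.environ["CUDA_VISIBLE_DEVICES"] = ""

# Usage (sketch):
# loader = DataLoader(dataset, num_workers=4,
#                     worker_init_fn=hide_cuda_in_worker)
# or, to avoid forking altogether:
# loader = DataLoader(dataset, num_workers=4,
#                     multiprocessing_context="spawn")
```

Note this only helps for workers that never need the GPU; the training processes themselves must keep CUDA visible.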


What version of PyTorch is this?

Latest stable, 2.4.0


I faced exactly the same issue. Can anyone help? Thank you!