How to know if an operation is blocking/synchronizing or not?

Where do we find documentation on whether an operation such as cuda_tensor.item(), cuda_tensor.numel(), or cuda_tensor.to("cpu") is blocking or not? (See "CUDA semantics" in the PyTorch 2.0 documentation.)


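You can call torch.cuda.set_sync_debug_mode(mode) with mode set to "warn" or "error" and then execute the line of code in question. A synchronizing operation will then issue a warning or raise an error, respectively.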


Python 3.10.11 | packaged by conda-forge | (main, May 10 2023, 18:58:44) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.set_sync_debug_mode("warn")
/opt/conda/envs/implicit-pg/lib/python3.10/site-packages/torch/cuda/ UserWarning: Synchronization debug mode is a prototype feature and does not yet detect all synchronizing operations (Triggered internally at /opt/conda/conda-bld/pytorch_1682343967769/work/torch/csrc/cuda/Module.cpp:830.)
>>> t = torch.arange(10, device="cuda")
>>> z = t**2
>>> z.numel()
>>> x = z.float().mean()
>>> m = x.item()
<stdin>:1: UserWarning: called a synchronizing CUDA operation (Triggered internally at /opt/conda/conda-bld/pytorch_1682343967769/work/c10/cuda/CUDAFunctions.cpp:148.)
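
The manual check in the session above could be wrapped in a small helper. This is only a sketch: the function name synchronizes is hypothetical, it assumes the warning text matches the "called a synchronizing CUDA operation" message shown above (which may differ between PyTorch versions), and, since sync debug mode is a prototype feature, a False result does not prove that an operation is non-blocking.

```python
import warnings

import torch


def synchronizes(fn):
    """Return True if calling fn() emits the sync-debug warning.

    Hypothetical helper: it relies on the warning text shown in the
    session above and on the prototype sync debug mode, which does
    not detect every synchronizing operation.
    """
    previous = torch.cuda.get_sync_debug_mode()  # remember the current mode
    torch.cuda.set_sync_debug_mode("warn")
    try:
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always")  # do not deduplicate repeated warnings
            fn()
    finally:
        torch.cuda.set_sync_debug_mode(previous)  # restore the earlier mode
    return any(
        "called a synchronizing CUDA operation" in str(w.message) for w in caught
    )


if torch.cuda.is_available():
    t = torch.arange(10, device="cuda")
    x = (t**2).float().mean()
    print("numel:", synchronizes(lambda: x.numel()))
    print("item: ", synchronizes(lambda: x.item()))
```

Going by the session above, one would expect the numel() call to report False and the item() call to report True. Restoring the previous mode in the finally block keeps the check from changing behaviour for the rest of the program.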