Where do we find documentation on whether an operation such as cuda_tensor.item(), cuda_tensor.numel(), or cuda_tensor.to("cpu") is blocking or not? (CUDA semantics — PyTorch 2.0 documentation).
Thanks!
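You can call torch.cuda.set_sync_debug_mode(mode) with mode set to "warn" or "error" and then execute the line of code you want to check. A synchronizing operation will then emit a warning or raise an error, respectively.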
Awesome!!
Python 3.10.11 | packaged by conda-forge | (main, May 10 2023, 18:58:44) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.set_sync_debug_mode("warn")
/opt/conda/envs/implicit-pg/lib/python3.10/site-packages/torch/cuda/__init__.py:762: UserWarning: Synchronization debug mode is a prototype feature and does not yet detect all synchronizing operations (Triggered internally at /opt/conda/conda-bld/pytorch_1682343967769/work/torch/csrc/cuda/Module.cpp:830.)
  torch._C._cuda_set_sync_debug_mode(debug_mode)
>>> t = torch.arange(10, device="cuda")
>>> z = t**2
>>> z.numel()
10
>>> x = z.float().mean()
>>> m = x.item()
<stdin>:1: UserWarning: called a synchronizing CUDA operation (Triggered internally at /opt/conda/conda-bld/pytorch_1682343967769/work/c10/cuda/CUDAFunctions.cpp:148.)
>>>
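For checking several operations in one go, the session above could be wrapped in a small helper. This is only a sketch: the name sync_warnings is made up for illustration, it matches on the warning text shown in the session ("called a synchronizing CUDA operation"), which may differ between PyTorch versions, and, as the prototype warning says, the debug mode does not detect every synchronizing operation. It returns None when torch or a CUDA device is not available.

```python
import warnings

try:
    import torch
except ImportError:  # torch not installed; the helper then has nothing to check
    torch = None

# Assumption: this is the warning text printed in the session above.
SYNC_MESSAGE = "called a synchronizing CUDA operation"


def sync_warnings(fn):
    """Run fn() in "warn" sync debug mode and return the sync warnings it triggered.

    Hypothetical helper, not part of PyTorch. Returns None if no CUDA device
    is available, otherwise a (possibly empty) list of warning messages.
    """
    if torch is None or not torch.cuda.is_available():
        return None
    previous = torch.cuda.get_sync_debug_mode()  # remember the current mode
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # do not let repeated warnings be deduplicated
        torch.cuda.set_sync_debug_mode("warn")
        try:
            fn()
        finally:
            torch.cuda.set_sync_debug_mode(previous)  # restore the previous mode
    return [str(w.message) for w in caught if SYNC_MESSAGE in str(w.message)]


if torch is not None and torch.cuda.is_available():
    t = torch.arange(10, device="cuda")
    x = (t ** 2).float().mean()
    # In the session above, numel() produced no warning and item() produced one.
    print("numel:", sync_warnings(lambda: t.numel()))
    print("item: ", sync_warnings(lambda: x.item()))
```

An empty list means the debug mode reported nothing for that call. Given the prototype status, that is a hint rather than a guarantee that the call is non-blocking.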