Program fails while trying to broadcast a simple tensor - any idea why?
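import torch
import torch.distributed

def checker(r):
    # `rank` (this process's rank) is assumed to be defined elsewhere; the post doesn't show it
    if rank == r:
        tensor = torch.tensor(0, device="cuda")
    else:
        tensor = torch.tensor(1, device="cuda")
    torch.distributed.broadcast(tensor, r)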
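
For anyone trying to reproduce this, here is a minimal sketch of the setup the snippet above seems to rely on but does not show. Everything outside the body of checker is an assumption on my part: the run helper, the file-based init_method, the backend choice (gloo for CPU, nccl for CUDA), and the one-GPU-per-rank device selection are hypothetical and may not match the original program.

```python
import os
import tempfile

import torch
import torch.distributed as dist


def checker(r, rank, device):
    # The source rank holds 0, every other rank holds 1 (as in the post).
    value = 0 if rank == r else 1
    tensor = torch.tensor(value, device=device)
    # After the broadcast, every rank should hold the source rank's value.
    dist.broadcast(tensor, src=r)
    return tensor


def run(rank, world_size, init_file, use_cuda=False):
    # Assumption: gloo for CPU tensors, nccl for CUDA tensors.
    backend = "nccl" if use_cuda else "gloo"
    dist.init_process_group(
        backend,
        init_method=f"file://{init_file}",
        rank=rank,
        world_size=world_size,
    )
    if use_cuda:
        # Assumption: one GPU per rank. A bare device="cuda" resolves to the
        # current device, which is GPU 0 unless it is changed per process.
        torch.cuda.set_device(rank)
        device = torch.device("cuda", rank)
    else:
        device = torch.device("cpu")
    try:
        return checker(0, rank, device).item()
    finally:
        dist.destroy_process_group()


if __name__ == "__main__":
    # Single-process smoke test on CPU; a real run launches one process per rank.
    with tempfile.TemporaryDirectory() as tmp:
        print(run(rank=0, world_size=1, init_file=os.path.join(tmp, "init")))
```

The single-process CPU run only checks that the process group and the broadcast call are wired up. To exercise the multi-GPU case, each rank would call run(..., use_cuda=True) from its own process.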
nvidia-smi returns
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.37                 Driver Version: 396.37                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro GV100        Off  | 00000000:1A:00.0 Off |                  Off |
| 38%   47C    P2    40W / 250W |   2435MiB / 32508MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Quadro GV100        Off  | 00000000:67:00.0 Off |                  Off |
| 43%   52C    P2    37W / 250W |     11MiB / 32508MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+