I’m using torch.distributed with the MPI backend and would like to share a CUDA tensor across my MPI world.
From https://pytorch.org/docs/stable/multiprocessing.html#sharing-cuda-tensors it looks like sharing CUDA memory between processes is possible with torch.multiprocessing.
How can I do the same with torch.distributed?
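For context, here is roughly what my setup looks like. This is only a minimal sketch, assuming processes are launched with mpirun and there is one GPU per rank; the tensor shape/values are placeholders, and the broadcast is what I do today (copying data), not the memory sharing I'm after:

```python
import torch
import torch.distributed as dist

# Minimal sketch of my current setup (shape/values are placeholders).
# Processes are launched with mpirun, so the MPI backend picks up the
# rank and world size from the MPI environment.
dist.init_process_group(backend="mpi")

rank = dist.get_rank()
device = torch.device("cuda", rank % torch.cuda.device_count())

# Each rank holds a CUDA tensor. Today I copy it around with a collective
# (which, as I understand it, needs a CUDA-aware MPI build), but I'd like
# the ranks to actually share the underlying CUDA memory, the way
# torch.multiprocessing does.
tensor = torch.full((4,), float(rank), device=device)
dist.broadcast(tensor, src=0)
print(f"rank {rank}: {tensor}")

dist.destroy_process_group()
```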
Thanks!