Distributed applications with shared memory support

I am writing an application that requires transmitting objects between processes. However, the transmission time is high because of network latency. Is there a way for collective operations such as torch.distributed.reduce() and torch.distributed.gather() to use shared memory?

Thank you very much.