How can multiple processes write to a tensor at the same time?

Can I call share_memory_() and then just write to different indices:

tensor = torch.randn(5, 5).share_memory_()

And then from multiple processes:

tensor[process_id] = process_id

Is there a better way, maybe involving all_reduce or something similar? I want non-blocking writes, i.e. plain hogwild-style writes from multiple processes in parallel, with no synchronization. A minimal sketch of what I mean is below.
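
For context, here is a minimal end-to-end version of what I'm attempting with torch.multiprocessing (worker and NUM_PROCS are just placeholder names I made up):

import torch
import torch.multiprocessing as mp

NUM_PROCS = 4

def worker(tensor, process_id):
    # Each process writes to its own row, no locking (hogwild-style).
    tensor[process_id] = process_id

if __name__ == "__main__":
    tensor = torch.zeros(NUM_PROCS, 5)
    tensor.share_memory_()  # moves the storage to shared memory, in place

    procs = []
    for pid in range(NUM_PROCS):
        p = mp.Process(target=worker, args=(tensor, pid))
        p.start()
        procs.append(p)
    for p in procs:
        p.join()

    print(tensor)  # each row should now equal its process id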