PyTorch distributed concurrent queue/buffer

Hi everyone,

I am wondering whether there is any way, with PyTorch distributed, to build a concurrent queue (or buffer) between the parameter server and the workers.

The idea is that every worker would act as a producer and push messages into the queue.

The parameter server would act as a consumer and pop messages off the queue.

In addition, the parameter server should be able to check the current length of the queue.

Thank you!

Hey @ryuxin, can this be implemented as a wrapper on top of the RPC API? For example, can you implement the queuing logic as an RPC target function? Some related tutorials:

  1. https://pytorch.org/tutorials/intermediate/rpc_param_server_tutorial.html
  2. https://github.com/pytorch/tutorials/blob/release/1.6/intermediate_source/rpc_async_execution.rst
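To make the suggestion concrete, here is a rough, untested sketch of the queuing logic as RPC target functions, assuming a two-process setup where rank 0 is the parameter server (named "ps") and rank 1 is a worker. The names `push_msg`, `queue_len`, and `msg_queue` are placeholders I made up for illustration, not part of the RPC API:

```python
import os
import queue

import torch
import torch.distributed.rpc as rpc
import torch.multiprocessing as mp

# Lives on the parameter-server process; queue.Queue is thread-safe, so
# concurrent RPC callbacks from multiple workers can push into it safely.
msg_queue = queue.Queue()


def push_msg(msg):
    # RPC target function: workers (producers) call this remotely on "ps".
    msg_queue.put(msg)


def queue_len():
    # RPC target function: lets callers check the current backlog size.
    return msg_queue.qsize()


def run(rank, world_size):
    os.environ["MASTER_ADDR"] = "localhost"
    os.environ["MASTER_PORT"] = "29500"
    name = "ps" if rank == 0 else f"worker{rank}"
    rpc.init_rpc(name, rank=rank, world_size=world_size)

    if rank != 0:
        # Producer side: push a few messages to the queue hosted on "ps".
        for step in range(3):
            rpc.rpc_sync("ps", push_msg, args=((name, step, torch.randn(4)),))

    # Graceful shutdown waits for all outstanding RPCs on all processes.
    rpc.shutdown()

    if rank == 0:
        # Consumer side: drain the local queue (done after shutdown here only
        # to keep the demo simple; a real PS would consume while serving).
        print("queue length:", msg_queue.qsize())
        while not msg_queue.empty():
            sender, step, tensor = msg_queue.get()
            print(f"got step {step} from {sender}")


if __name__ == "__main__":
    world_size = 2
    mp.spawn(run, args=(world_size,), nprocs=world_size, join=True)
```

In a real setup, the parameter server could also call `queue_len` (locally, or via `rpc_sync` from another process) to decide when to aggregate, and the consumer loop would run in a background thread while the RPC server keeps accepting pushes.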

Thanks for the hint!