Our team is planning to use CPUs from multiple machines for network inference and data production, and then use a single GPU server for network training. Is there anything in the torch.distributed package that can help us with this setup?
torch.distributed.rpc should be able to help. Here is a list of tutorials.
The use case looks similar to the following two examples:
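As a rough sketch of how the RPC API could fit this setup: a trainer process on the GPU server can call functions on CPU worker processes via torch.distributed.rpc and collect the data they produce. The process names ("trainer") and the generate_batch function below are illustrative, not from the tutorials; for brevity this demo runs with world_size=1 in a single process, whereas a real deployment would launch one process per machine, each with its own rank.

```python
import os
import torch
import torch.distributed.rpc as rpc

def generate_batch(batch_size):
    # Placeholder for the real work: on a CPU worker this would run the
    # network forward pass and return (input, prediction) training data.
    return torch.randn(batch_size, 4)

def main():
    # In production, every machine calls init_rpc with a shared MASTER_ADDR,
    # its own rank, and the total world_size.
    os.environ.setdefault("MASTER_ADDR", "localhost")
    os.environ.setdefault("MASTER_PORT", "29500")
    rpc.init_rpc("trainer", rank=0, world_size=1)

    # The trainer requests a batch from a worker over RPC; here the target
    # is itself, but it would normally be a name like "worker1".
    batch = rpc.rpc_sync("trainer", generate_batch, args=(8,))
    print(batch.shape)

    rpc.shutdown()

if __name__ == "__main__":
    main()
```

The training loop on the GPU server would then move each received batch to the GPU and run the optimizer step, while the CPU workers keep producing data in parallel.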
Thank you very much, I’ll look into them carefully!