For reference, I ultimately decided to integrate asyncio directly with PyTorch RPC to get around this issue. I describe how I did it in this post: Pytorch Distributed RPC bottleneck in _recursive_compile_class - #9 by jeremysalwen
This idea of using @rpc.functions.async_execution is interesting, but to me the example looks like it's begging to be a coroutine instead of a pair of functions chained together with a callback.
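
A minimal sketch of the contrast, using plain asyncio rather than PyTorch RPC so that it runs on its own. `fake_remote_add` is a hypothetical stand-in for a remote call (something like `rpc.rpc_async(to, torch.add, ...)`), and both function names are made up for illustration. This is not the asyncio/RPC integration from the linked post, only the callback-versus-coroutine shape:

```python
import asyncio


async def fake_remote_add(x, y):
    # Hypothetical stand-in for a remote call; it just yields to the
    # event loop once and then returns the sum.
    await asyncio.sleep(0)
    return x + y


def add_chained_callback(x, y, z):
    """Callback style: start the call, then attach a second function
    that finishes the work once the first result arrives."""
    loop = asyncio.get_running_loop()
    result = loop.create_future()
    task = asyncio.ensure_future(fake_remote_add(x, y))
    # The "second half" of the logic lives in this callback.
    task.add_done_callback(lambda t: result.set_result(t.result() + z))
    return result


async def add_chained_coroutine(x, y, z):
    """Coroutine style: the same two steps, read top to bottom."""
    partial = await fake_remote_add(x, y)
    return partial + z


async def main():
    a = await add_chained_callback(1, 2, 3)
    b = await add_chained_coroutine(1, 2, 3)
    return a, b


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Both versions compute the same thing; the coroutine just keeps the two steps in one function body instead of splitting them across a function and a callback.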