Torch distributed in different streams

Hi, experts

We found two strange things when using torch.distributed:

  1. A collective op always runs in a separate stream (not the current stream), but the code shows it chooses the default stream when async_op is False.
  2. Sometimes two ops even run in two separate streams (neither of them the current stream).

Really appreciate your help!